From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Baochen Qiang <quic_bqiang@quicinc.com>,
Kalle Valo <quic_kvalo@quicinc.com>,
Sasha Levin <sashal@kernel.org>,
kvalo@kernel.org, jjohnson@kernel.org,
linux-wireless@vger.kernel.org, ath12k@lists.infradead.org
Subject: [PATCH AUTOSEL 6.9 15/35] wifi: ath12k: fix kernel crash during resume
Date: Mon, 27 May 2024 10:11:20 -0400 [thread overview]
Message-ID: <20240527141214.3844331-15-sashal@kernel.org> (raw)
In-Reply-To: <20240527141214.3844331-1-sashal@kernel.org>
From: Baochen Qiang <quic_bqiang@quicinc.com>
[ Upstream commit 303c017821d88ebad887814114d4e5966d320b28 ]
Currently during resume, QMI target memory is not properly handled, resulting
in kernel crash in case DMA remap is not supported:
BUG: Bad page state in process kworker/u16:54 pfn:36e80
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x36e80
page dumped because: nonzero _refcount
Call Trace:
bad_page
free_page_is_bad_report
__free_pages_ok
__free_pages
dma_direct_free
dma_free_attrs
ath12k_qmi_free_target_mem_chunk
ath12k_qmi_msg_mem_request_cb
The reason is:
Once ath12k module is loaded, firmware sends memory request to host. In case
DMA remap not supported, ath12k refuses the first request due to failure in
allocating with large segment size:
ath12k_pci 0000:04:00.0: qmi firmware request memory request
ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 7077888
ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 8454144
ath12k_pci 0000:04:00.0: qmi dma allocation failed (7077888 B type 1), will try later with small size
ath12k_pci 0000:04:00.0: qmi delays mem_request 2
ath12k_pci 0000:04:00.0: qmi firmware request memory request
Later firmware comes back with more but small segments and allocation
succeeds:
ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 262144
ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 65536
ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 524288
Now ath12k is working. If suspend is triggered, firmware will be reloaded
during resume. As same as before, firmware requests two large segments at
first. In ath12k_qmi_msg_mem_request_cb() segment count and size are
assigned:
ab->qmi.mem_seg_count == 2
ab->qmi.target_mem[0].size == 7077888
ab->qmi.target_mem[1].size == 8454144
Then allocation failed like before and ath12k_qmi_free_target_mem_chunk()
is called to free all allocated segments. Note the first segment is skipped
because its v.addr is cleared due to allocation failure:
chunk->v.addr = dma_alloc_coherent()
Also note that this leaks that segment because it has not been freed.
While freeing the second segment, a size of 8454144 is passed to
dma_free_coherent(). However remember that this segment is allocated at
the first time firmware is loaded, before suspend. So its real size is
524288, much smaller than 8454144. As a result kernel found we are freeing
some memory which is in use and thus crashed.
So one possible fix would be to free those segments during suspend. This
works because with them freed, ath12k_qmi_free_target_mem_chunk() does
nothing: all segment addresses are NULL so dma_free_coherent() is not called.
But note that ath11k has similar logic but never hits this issue. Reviewing
code there shows the luck comes from QMI memory reuse logic. So the decision
is to port it to ath12k. Like in ath11k, the crash is avoided by adding
prev_size to target_mem_chunk structure and caching real segment size in it,
then prev_size instead of current size is passed to dma_free_coherent(),
no unexpected memory is freed now.
Also reuse m3 buffer.
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0-03427-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.15378.4
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://msgid.link/20240419034034.2842-1-quic_bqiang@quicinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/wireless/ath/ath12k/core.c | 1 -
drivers/net/wireless/ath/ath12k/qmi.c | 29 +++++++++++++++++++++-----
drivers/net/wireless/ath/ath12k/qmi.h | 2 ++
3 files changed, 26 insertions(+), 6 deletions(-)
diff --git a/drivers/net/wireless/ath/ath12k/core.c b/drivers/net/wireless/ath/ath12k/core.c
index 391b6fb2bd426..bff4598de4035 100644
--- a/drivers/net/wireless/ath/ath12k/core.c
+++ b/drivers/net/wireless/ath/ath12k/core.c
@@ -1132,7 +1132,6 @@ static void ath12k_core_reset(struct work_struct *work)
ATH12K_RECOVER_START_TIMEOUT_HZ);
ath12k_hif_power_down(ab);
- ath12k_qmi_free_resource(ab);
ath12k_hif_power_up(ab);
ath12k_dbg(ab, ATH12K_DBG_BOOT, "reset started\n");
diff --git a/drivers/net/wireless/ath/ath12k/qmi.c b/drivers/net/wireless/ath/ath12k/qmi.c
index 92845ffff44ad..4964fd7e4cd72 100644
--- a/drivers/net/wireless/ath/ath12k/qmi.c
+++ b/drivers/net/wireless/ath/ath12k/qmi.c
@@ -2325,8 +2325,9 @@ static void ath12k_qmi_free_target_mem_chunk(struct ath12k_base *ab)
for (i = 0; i < ab->qmi.mem_seg_count; i++) {
if (!ab->qmi.target_mem[i].v.addr)
continue;
+
dma_free_coherent(ab->dev,
- ab->qmi.target_mem[i].size,
+ ab->qmi.target_mem[i].prev_size,
ab->qmi.target_mem[i].v.addr,
ab->qmi.target_mem[i].paddr);
ab->qmi.target_mem[i].v.addr = NULL;
@@ -2352,6 +2353,20 @@ static int ath12k_qmi_alloc_target_mem_chunk(struct ath12k_base *ab)
case M3_DUMP_REGION_TYPE:
case PAGEABLE_MEM_REGION_TYPE:
case CALDB_MEM_REGION_TYPE:
+ /* Firmware reloads in recovery/resume.
+ * In such cases, no need to allocate memory for FW again.
+ */
+ if (chunk->v.addr) {
+ if (chunk->prev_type == chunk->type &&
+ chunk->prev_size == chunk->size)
+ goto this_chunk_done;
+
+ /* cannot reuse the existing chunk */
+ dma_free_coherent(ab->dev, chunk->prev_size,
+ chunk->v.addr, chunk->paddr);
+ chunk->v.addr = NULL;
+ }
+
chunk->v.addr = dma_alloc_coherent(ab->dev,
chunk->size,
&chunk->paddr,
@@ -2370,6 +2385,10 @@ static int ath12k_qmi_alloc_target_mem_chunk(struct ath12k_base *ab)
chunk->type, chunk->size);
return -ENOMEM;
}
+
+ chunk->prev_type = chunk->type;
+ chunk->prev_size = chunk->size;
+this_chunk_done:
break;
default:
ath12k_warn(ab, "memory type %u not supported\n",
@@ -2675,10 +2694,6 @@ static int ath12k_qmi_m3_load(struct ath12k_base *ab)
size_t m3_len;
int ret;
- if (m3_mem->vaddr)
- /* m3 firmware buffer is already available in the DMA buffer */
- return 0;
-
if (ab->fw.m3_data && ab->fw.m3_len > 0) {
/* firmware-N.bin had a m3 firmware file so use that */
m3_data = ab->fw.m3_data;
@@ -2700,6 +2715,9 @@ static int ath12k_qmi_m3_load(struct ath12k_base *ab)
m3_len = fw->size;
}
+ if (m3_mem->vaddr)
+ goto skip_m3_alloc;
+
m3_mem->vaddr = dma_alloc_coherent(ab->dev,
m3_len, &m3_mem->paddr,
GFP_KERNEL);
@@ -2710,6 +2728,7 @@ static int ath12k_qmi_m3_load(struct ath12k_base *ab)
goto out;
}
+skip_m3_alloc:
memcpy(m3_mem->vaddr, m3_data, m3_len);
m3_mem->size = m3_len;
diff --git a/drivers/net/wireless/ath/ath12k/qmi.h b/drivers/net/wireless/ath/ath12k/qmi.h
index 6ee33c9851c6b..f34263d4bee88 100644
--- a/drivers/net/wireless/ath/ath12k/qmi.h
+++ b/drivers/net/wireless/ath/ath12k/qmi.h
@@ -96,6 +96,8 @@ struct ath12k_qmi_event_msg {
struct target_mem_chunk {
u32 size;
u32 type;
+ u32 prev_size;
+ u32 prev_type;
dma_addr_t paddr;
union {
void __iomem *ioaddr;
--
2.43.0
next prev parent reply other threads:[~2024-05-27 14:12 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-27 14:11 [PATCH AUTOSEL 6.9 01/35] ssb: Fix potential NULL pointer dereference in ssb_device_uevent() Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 02/35] selftests/bpf: Prevent client connect before server bind in test_tc_tunnel.sh Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 03/35] selftests/bpf: Fix flaky test btf_map_in_map/lookup_update Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 04/35] bpf: Avoid kfree_rcu() under lock in bpf_lpm_trie Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 05/35] devlink: use kvzalloc() to allocate devlink instance resources Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 06/35] batman-adv: bypass empty buckets in batadv_purge_orig_ref() Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 07/35] wifi: rtw89: 8852c: add quirk to set PCI BER for certain platforms Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 08/35] wifi: ath9k: work around memset overflow warning Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 09/35] af_packet: avoid a false positive warning in packet_setsockopt() Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 10/35] clocksource: Make watchdog and suspend-timing multiplication overflow safe Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 11/35] ACPI: x86: Add PNP_UART1_SKIP quirk for Lenovo Blade2 tablets Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 12/35] net: sfp: add quirk for another multigig RollBall transceiver Sasha Levin
2024-05-27 14:54 ` Marek Behún
2024-06-19 14:28 ` Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 13/35] drop_monitor: replace spin_lock by raw_spin_lock Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 14/35] ACPI: resource: Do IRQ override on GMxBGxx (XMG APEX 17 M23) Sasha Levin
2024-05-27 14:11 ` Sasha Levin [this message]
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 16/35] scsi: qedi: Fix crash while reading debugfs attribute Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 17/35] net: sfp: enhance quirk for Fibrestore 2.5G copper SFP module Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 18/35] net: sfp: add quirk for ATS SFP-GE-T 1000Base-TX module Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 19/35] net/sched: fix false lockdep warning on qdisc root lock Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 20/35] arm64/sysreg: Update PIE permission encodings Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 21/35] kselftest: arm64: Add a null pointer check Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 22/35] net: dsa: realtek: keep default LED state in rtl8366rb Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 23/35] net: dsa: realtek: do not assert reset on remove Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 24/35] ACPI: resource: Skip IRQ override on Asus Vivobook Pro N6506MV Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 25/35] netpoll: Fix race condition in netpoll_owner_active Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 26/35] wifi: ath12k: fix the problem that down grade phy mode operation Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 27/35] wifi: mt76: mt7921s: fix potential hung tasks during chip recovery Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 28/35] HID: Add quirk for Logitech Casa touchpad Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 29/35] HID: asus: fix more n-key report descriptors if n-key quirked Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 30/35] ACPI: video: Add backlight=native quirk for Lenovo Slim 7 16ARH7 Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 31/35] HID: bpf: add in-tree HID-BPF fix for the HP Elite Presenter Mouse Sasha Levin
2024-05-27 14:50 ` Benjamin Tissoires
2024-06-19 14:29 ` Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 32/35] bpf: avoid uninitialized warnings in verifier_global_subprogs.c Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 33/35] selftests: net: fix timestamp not arriving in cmsg_time.sh Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 34/35] net: ena: Add validation for completion descriptors consistency Sasha Levin
2024-05-27 14:11 ` [PATCH AUTOSEL 6.9 35/35] Bluetooth: ath3k: Fix multiple issues reported by checkpatch.pl Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240527141214.3844331-15-sashal@kernel.org \
--to=sashal@kernel.org \
--cc=ath12k@lists.infradead.org \
--cc=jjohnson@kernel.org \
--cc=kvalo@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-wireless@vger.kernel.org \
--cc=quic_bqiang@quicinc.com \
--cc=quic_kvalo@quicinc.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox