All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kalle Valo <kvalo@kernel.org>
To: Baochen Qiang <quic_bqiang@quicinc.com>
Cc: <ath12k@lists.infradead.org>,  <linux-wireless@vger.kernel.org>
Subject: Re: [PATCH] wifi: ath12k: fix kernel crash during resume
Date: Fri, 19 Apr 2024 07:18:05 +0300	[thread overview]
Message-ID: <87cyqmhtoi.fsf@kernel.org> (raw)
In-Reply-To: <20240419034034.2842-1-quic_bqiang@quicinc.com> (Baochen Qiang's message of "Fri, 19 Apr 2024 11:40:34 +0800")

Baochen Qiang <quic_bqiang@quicinc.com> writes:

> Currently during resume, QMI target memory is not properly handled, resulting
> in kernel crash in case DMA remap is not supported:
>
> BUG: Bad page state in process kworker/u16:54  pfn:36e80
> page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x36e80
> page dumped because: nonzero _refcount
> Call Trace:
>  bad_page
>  free_page_is_bad_report
>  __free_pages_ok
>  __free_pages
>  dma_direct_free
>  dma_free_attrs
>  ath12k_qmi_free_target_mem_chunk
>  ath12k_qmi_msg_mem_request_cb
>
> The reason is:
> Once ath12k module is loaded, firmware sends memory request to host. In case
> DMA remap not supported, ath12k refuses the first request due to failure in
> allocating with large segment size:
>
> ath12k_pci 0000:04:00.0: qmi firmware request memory request
> ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 7077888
> ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 8454144
> ath12k_pci 0000:04:00.0: qmi dma allocation failed (7077888 B type 1), will try later with small size
> ath12k_pci 0000:04:00.0: qmi delays mem_request 2
> ath12k_pci 0000:04:00.0: qmi firmware request memory request
>
> Later firmware comes back with more but small segments and allocation
> succeeds:
>
> ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 262144
> ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 524288
> ath12k_pci 0000:04:00.0: qmi mem seg type 4 size 65536
> ath12k_pci 0000:04:00.0: qmi mem seg type 1 size 524288
>
> Now ath12k is working. If suspend is triggered, firmware will be reloaded
> during resume. As same as before, firmware requests two large segments at
> first. In ath12k_qmi_msg_mem_request_cb() segment count and size are
> assigned:
>
> 	ab->qmi.mem_seg_count == 2
> 	ab->qmi.target_mem[0].size == 7077888
> 	ab->qmi.target_mem[1].size == 8454144
>
> Then allocation failed like before and ath12k_qmi_free_target_mem_chunk()
> is called to free all allocated segments. Note the first segment is skipped
> because its v.addr is cleared due to allocation failure:
>
> 	chunk->v.addr = dma_alloc_coherent()
>
> Also note that this leaks that segment because it has not been freed.
>
> While freeing the second segment, a size of 8454144 is passed to
> dma_free_coherent(). However remember that this segment is allocated at
> the first time firmware is loaded, before suspend. So its real size is
> 524288, much smaller than 8454144. As a result kernel found we are freeing
> some memory which is in use and thus crashed.
>
> So one possible fix would be to free those segments during suspend. This
> works because with them freed, ath12k_qmi_free_target_mem_chunk() does
> nothing: all segment addresses are NULL so dma_free_coherent() is not called.
>
> But note that ath11k has similar logic but never hits this issue. Reviewing
> code there shows the luck comes from QMI memory reuse logic. So the decision
> is to port it to ath12k. Like in ath11k, the crash is avoided by adding
> prev_size to target_mem_chunk structure and caching real segment size in it,
> then prev_size instead of current size is passed to dma_free_coherent(),
> no unexpected memory is freed now.
>
> Also reuse m3 buffer.
>
> Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0-03427-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.15378.4
> Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
>
> Fixes: 64430ddfb132 ("wifi: ath12k: support suspend/resume")
> Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>

To keep git bisect clean should this patch be applied before applying
the suspend patchset? No need resend or anything, just checking what
should be the correct order.

-- 
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


  reply	other threads:[~2024-04-19  4:18 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-19  3:40 [PATCH] wifi: ath12k: fix kernel crash during resume Baochen Qiang
2024-04-19  4:18 ` Kalle Valo [this message]
2024-04-19  4:38   ` Kalle Valo
2024-04-22 13:12     ` Kalle Valo
2024-04-22 15:41       ` Jeff Johnson
2024-04-22 18:41         ` Jeff Johnson
2024-04-23  1:34           ` Jeff Johnson
2024-04-23  4:10             ` Kalle Valo
2024-04-23  5:09               ` Baochen Qiang
2024-04-23  9:25 ` Kalle Valo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87cyqmhtoi.fsf@kernel.org \
    --to=kvalo@kernel.org \
    --cc=ath12k@lists.infradead.org \
    --cc=linux-wireless@vger.kernel.org \
    --cc=quic_bqiang@quicinc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.