From: Kalle Valo <kvalo@codeaurora.org>
To: Loic Poulain <loic.poulain@linaro.org>
Cc: linux-arm-msm <linux-arm-msm@vger.kernel.org>,
	Bhaumik Bhatt <bbhatt@codeaurora.org>,
	Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>,
	ath11k@lists.infradead.org, Hemant Kumar <hemantk@codeaurora.org>
Subject: Re: [regression] mhi: rmmod ath11k_pci crashing on v5.11
Date: Tue, 09 Feb 2021 17:25:37 +0200
Message-ID: <87y2fxtg9a.fsf@codeaurora.org>
In-Reply-To: <CAMZdPi9pFHA-p3-e+-HNp8y3QPwg7GOgDucJ+HG8ETtxqZ=_9A@mail.gmail.com> (Loic Poulain's message of "Tue, 9 Feb 2021 16:21:28 +0100")

Loic Poulain <loic.poulain@linaro.org> writes:

> On Tue, 9 Feb 2021 at 15:48, Kalle Valo <kvalo@codeaurora.org> wrote:
>>
>> Hi Loic,
>>
>> I noticed that v5.11-rc6 was crashing on my ath11k test box with
>> QCA6390. The box was down for a few weeks, so I only noticed it late in the
>> cycle. After some manual testing I found out that reverting this commit
>> fixes the issue:
>>
>> a7f422f2f89e bus: mhi: Fix channel close issue on driver remove
>>
>> The crash happens when I issue 'sudo rmmod ath11k_pci' and it happens
>> every time. Through netconsole I get:
>>
>> Feb 9 16:43:30 nuc1 [ 313.202778] ath11k_pci 0000:06:00.0: qmi
>> failed set mode request, mode: 4, err = -110
>> Feb 9 16:43:30 nuc1 [ 313.202932] ath11k_pci 0000:06:00.0: qmi
>> failed to send wlan mode off
>> Feb  9 16:43:30 nuc1 [  313.225017] ------------[ cut here ]------------
>> Feb 9 16:43:30 nuc1 [ 313.225118] DMA-API: ath11k_pci 0000:06:00.0:
>> device driver tries to free DMA memory it has not allocated [device
>> address=0x00000000fffbc000] [size=2047 bytes]
>> Feb 9 16:43:30 nuc1 [ 313.225146] WARNING: CPU: 2 PID: 94 at
>> kernel/dma/debug.c:963 check_unmap+0x54a/0x8b0
>> Feb 9 16:43:30 nuc1 [ 313.225173] Modules linked in: ath11k_pci(-)
>> ath11k mac80211 libarc4 cfg80211 qmi_helpers qrtr_mhi mhi qrtr ns
>> mos7840 usbserial nvme nvme_core
>> Feb 9 16:43:30 nuc1 [ 313.225222] CPU: 2 PID: 94 Comm: kworker/u17:0
>> Not tainted 5.11.0-rc6 #362
>> Feb 9 16:43:30 nuc1 [ 313.225243] Hardware name: Intel(R) Client
>> Systems NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0049.2018.0801.1601
>> 08/01/2018
>> Feb  9 16:43:30 nuc1 [  313.225263] Workqueue: mhi_hiprio_wq mhi_pm_st_worker [mhi]
>> Feb  9 16:43:30 nuc1 [  313.225290] RIP: 0010:check_unmap+0x54a/0x8b0
>> Feb 9 16:43:30 nuc1 [ 313.225312] Code: 4d 85 e4 75 03 4c 8b 27 4c
>> 89 04 24 e8 8f 78 66 00 4c 8b 04 24 48 89 c6 4c 89 e9 4c 89 e2 48 c7
>> c7 c8 be 16 8f e8 26 39 ae 00 <0f> 0b 44 8b 1d 6d c2 9b 01 45 85 db
>> 0f 84 5f 02 00 00 48 83 c4 18
>> Feb  9 16:43:30 nuc1 [  313.225333] RSP: 0018:ffffbab5c08f3ab0 EFLAGS: 00010282
>> Feb 9 16:43:30 nuc1 [ 313.225355] RAX: 0000000000000000 RBX:
>> 00000000fffbc000 RCX: ffff99dbf55d9fb8
>> Feb 9 16:43:30 nuc1 [ 313.225375] RDX: 00000000ffffffd8 RSI:
>> 0000000000000027 RDI: ffff99dbf55d9fb0
>> Feb 9 16:43:30 nuc1 [ 313.225395] RBP: ffffbab5c08f3b00 R08:
>> 0000000000000001 R09: 0000000000000000
>> Feb 9 16:43:30 nuc1 [ 313.225415] R10: 0000000000000003 R11:
>> 3fffffffffffffff R12: ffff99da84c525d0
>> Feb 9 16:43:30 nuc1 [ 313.225434] R13: 00000000fffbc000 R14:
>> ffffffff90b96c90 R15: 0000000000000000
>> Feb 9 16:43:30 nuc1 [ 313.225453] FS: 0000000000000000(0000)
>> GS:ffff99dbf5400000(0000) knlGS:0000000000000000
>> Feb 9 16:43:30 nuc1 [ 313.225479] CS: 0010 DS: 0000 ES: 0000 CR0:
>> 0000000080050033
>> Feb 9 16:43:30 nuc1 [ 313.225500] CR2: 0000556d03a34250 CR3:
>> 000000010d9e2003 CR4: 00000000003706e0
>> Feb  9 16:43:30 nuc1 [  313.225520] Call Trace:
>> Feb  9 16:43:30 nuc1 [  313.225541]  ? __lock_acquire+0x3bd/0x6d0
>> Feb  9 16:43:30 nuc1 [  313.225565]  debug_dma_free_coherent+0xb0/0xf0
>> Feb  9 16:43:30 nuc1 [  313.225594]  ? mhi_driver_remove+0x11d/0x290 [mhi]
>> Feb  9 16:43:30 nuc1 [  313.225620]  ? __mutex_lock+0x6ca/0x8f0
>> Feb  9 16:43:30 nuc1 [  313.225643]  ? qcom_mhi_qrtr_remove+0x18/0x30 [qrtr_mhi]
>> Feb  9 16:43:30 nuc1 [  313.225668]  dma_free_attrs+0x48/0xb0
>> Feb  9 16:43:30 nuc1 [  313.225710]  mhi_driver_remove+0x21e/0x290 [mhi]
>> Feb  9 16:43:30 nuc1 [  313.225742]  __device_release_driver+0x17b/0x230
>
> Ok, I think it's because there are two paths leading to
> 'mhi_deinit_chan_ctxt' and causing double page free (driver's remove
> callback via channel_unprepare and mhi_driver_remove via deinit loop).
> Checking and going to provide a fix.

Great, thank you. Feel free to send me any test patches; the crash is
very easy for me to reproduce.
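
To illustrate the double-free pattern Loic describes above, here is a
minimal userspace sketch in C. It is not the MHI code and not the
eventual fix: struct model_chan and all model_* functions are
hypothetical stand-ins, and malloc()/free() stand in for the
DMA-coherent allocation. It only shows one common way to make a
teardown helper safe when two paths can reach it, by releasing the
buffer once and clearing the pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a channel context; not the real MHI struct. */
struct model_chan {
	void *ring;      /* stands in for the DMA-coherent ring buffer */
	int free_calls;  /* how many times the buffer was really released */
};

/* Idempotent teardown: release the buffer once, then clear the pointer. */
static void model_deinit_chan(struct model_chan *ch)
{
	if (!ch->ring)
		return;  /* already released by the other path */

	free(ch->ring);
	ch->ring = NULL;
	ch->free_calls++;
}

/* Path 1: the client driver's remove callback unprepares its channel. */
static void model_client_remove(struct model_chan *ch)
{
	model_deinit_chan(ch);
}

/*
 * Path 2: the bus-level remove first calls the client's remove callback,
 * then runs its own deinit step on the same channel.
 * Returns the number of real frees, or -1 if the allocation failed.
 */
static int model_bus_remove(void)
{
	struct model_chan ch = { .ring = malloc(2048), .free_calls = 0 };

	if (!ch.ring)
		return -1;

	model_client_remove(&ch);  /* first release */
	model_deinit_chan(&ch);    /* second call is a no-op thanks to the guard */

	assert(ch.ring == NULL);
	return ch.free_calls;
}
```

Without the NULL check in model_deinit_chan(), the second call would
free the same buffer again, which is the kind of misuse the DMA-API
debug warning in the log above reports. Whether the real fix guards the
helper or removes one of the two call paths is not stated in this
thread.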

-- 
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

-- 
ath11k mailing list
ath11k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath11k

