* Re: ath11k swiotlb buffer is full (on IMX8M with 4GiB DRAM)
@ 2024-11-23 0:43 Tim Harvey
2024-11-25 7:23 ` Baochen Qiang
0 siblings, 1 reply; 17+ messages in thread
From: Tim Harvey @ 2024-11-23 0:43 UTC
To: Baochen Qiang; +Cc: ath11k, linux-wireless, Fabio Estevam
On Thu, Nov 21, 2024 at 9:51 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote:
>
>
>
> On 11/22/2024 5:50 AM, Tim Harvey wrote:
> > On Tue, Nov 19, 2024 at 6:32 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote:
> >>
> >>
> >>
> >> On 11/20/2024 4:16 AM, Tim Harvey wrote:
> >>> Greetings,
> >>>
> >>> I've got an ath11k card that is failing to init on an IMX8MM system
> >>> with 4GB of DRAM:
> >>> [ 7.551582] ath11k_pci 0000:01:00.0: BAR 0 [mem
> >>> 0x18000000-0x181fffff 64bit]: assigned
> >>> [ 7.551713] ath11k_pci 0000:01:00.0: enabling device (0000 -> 0002)
> >>> [ 7.552401] ath11k_pci 0000:01:00.0: MSI vectors: 16
> >>> [ 7.552440] ath11k_pci 0000:01:00.0: qcn9074 hw1.0
> >>> [ 7.887186] mhi mhi0: Loaded FW: ath11k/QCN9074/hw1.0/amss.bin,
> >>> sha256: 5ee1b7b204541b5f99984f21d694ececaec08fbce1b520ffe6fe740b02a4afd7
> >>> [ 8.435964] ath11k_pci 0000:01:00.0: chip_id 0x0 chip_family 0x0
> >>> board_id 0xff soc_id 0xffffffff
> >>> [ 8.435991] ath11k_pci 0000:01:00.0: fw_version 0x270206d0
> >>> fw_build_timestamp 2022-08-04 12:48 fw_build_id
> >>> WLAN.HK.2.7.0.1-01744-QCAHKSWPL_SILICONZ-1
> >>> [ 8.441700] ath11k_pci 0000:01:00.0: Loaded FW:
> >>> ath11k/QCN9074/hw1.0/board-2.bin, sha256:
> >>> dbf0ca14aa1229eccd48f26f1026901b9718b143bd30b51b8ea67c84ba6207f1
> >>> [ 9.753764] ath11k_pci 0000:01:00.0: Loaded FW:
> >>> ath11k/QCN9074/hw1.0/m3.bin, sha256:
> >>> b6d957f335073a15a8de809398e1506f0200a08747eaf7189c843cf519ffc1de
> >>> [ 9.789791] ath11k_pci 0000:01:00.0: swiotlb buffer is full (sz:
> >>> 1048583 bytes), total 32768 (slots), used 2528 (slots)
> >>> [ 9.789853] ath11k_pci 0000:01:00.0: failed to set up tcl_comp ring (0) :-12
> >>> [ 9.790238] ath11k_pci 0000:01:00.0: failed to init DP: -12
> >>> root@noble-venice:~# cat /proc/cmdline
> >>> console=ttymxc1,115200 earlycon=ec_imx6q,0x30890000,115200
> >>> root=PARTUUID=5cdde84f-01 rootwait net.ifnames=0 cma=196M
> >>>
> >>> The IMX8MM's DRAM base is at 1GB so anything above 3GB hits the 32bit
> >>> address boundary. If I pass in a mem=3096M the device registers just
> >>> fine.
> >> yeah ... that parameter makes kernel alloc memory below 32bit boundary, thus swiotlb is not necessary.
> >
> > Hi Baochen,
> >
> > Yes, that makes sense as I step through the code. On IMX8M with DRAM
> > 3GB or less dma_capable(...) is true so swiotlb bounce buffers are not
> > needed.
> >
> >>
> >>>
> >>> I found this to be the case with modern kernels however I found
> >>> differing behavior with older kernels:
> >>> - 6.6 and 6.1 the device registers with 4GB DRAM but crashes on client connect
> >>> - 5.15 devices registers with 4GB DRAM and appears to work just fine
> >> are you using Linus' tree or the stable tree?
> >>
> >
> > For 6.6 I tested stable.
> can you try Linus's tree ? as I know the stable tree is possible to miss some important fix.
>
> >
> > This likely has something to do with commit dbd73acb22d8 ("wifi:
> > ath11k: enable 36 bit mask for stream DMA") but it would seem to me
> > that patch was trying to avoid the entire 32bit DMA limitation. Maybe
> > that patch sets the ath11k device DMA mask to 36 bits but maybe the
> > IMX8M PCI DMA is only capable of 32bits?
> that patch is making situation better, not worse. that said, it helps to avoid swiotlb in
> ath11k DMA, rather than to get it involved.
>
Yes, that patch would be an improvement on systems capable of
addressing 64bit memory but not on the IMX8M which is seemingly
capable of only 32bit DMA over PCI.
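To make the address arithmetic concrete, here is a minimal sketch. The helper name and constants are illustrative only, not kernel code. It assumes the 1GiB DRAM base mentioned earlier in this thread and a DMA limit at the 32-bit boundary:

```python
GIB = 1 << 30
DRAM_BASE = 1 * GIB   # IMX8MM DRAM starts at 1GiB (stated earlier in the thread)
DMA_LIMIT = 1 << 32   # assumption: PCI DMA only reaches addresses below 4GiB


def bytes_above_dma_limit(dram_size, base=DRAM_BASE, limit=DMA_LIMIT):
    """Return how many bytes of DRAM sit at or above the DMA limit."""
    end = base + dram_size
    return max(0, end - max(base, limit))


# With 4GiB fitted, the top of DRAM is at 5GiB, so part of it is out of reach.
print(bytes_above_dma_limit(4 * GIB) // GIB, "GiB above the 32-bit boundary")
# With 3GiB fitted, DRAM ends exactly at the boundary.
print(bytes_above_dma_limit(3 * GIB), "bytes above the 32-bit boundary")
```

Under these assumptions, any buffer that lands in the topmost 1GiB has to be bounced through swiotlb, which would be consistent with the device registering fine when memory is limited to about 3GB.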
> >
> >>>
> >>> Could anyone explain what is going on here? Obviously there have been
> >>> changes at some point to start using swiotlb which I believe was all
> >>> about avoiding 32bit DMA limitations but I'm not clear how I should be
> >>> configuring this for IMX8MM with 4GB DRAM. Maybe my kernel IOMMU
> >>> configuration is incorrect somehow?
> >> there are quite some options associated with IOMMU, not sure which one might be causing this. But basically you may check:
> >>
> >> CONFIG_IOMMU_IOVA
> >> CONFIG_IOMMU_API
> >> CONFIG_IOMMU_SUPPORT
> >> CONFIG_IOMMU_DMA=y
> >>
> >
> > These are enabled which I believe appropriate for IMX8M. If I want to
> > utilize the full 4GB DRAM on IMX then I must use IOMMU and swiotlb
> > which would mean a performance hit due to copying mem to/from bounce
> > buffers not to mention the fact that I can't figure out how to
> > configure the system to avoid the 'swiotlb swiotlb buffer is full'
> > issue.
My statement above about needing an IOMMU is wrong; apparently the
IMX8M SoCs don't have an IOMMU, so the fact that I have IOMMU support
enabled in the kernel should be a don't-care. If I understand swiotlb
correctly, an IOMMU, if I had one, would be used instead of swiotlb.
> >
> > Enabling CONFIG_SWIOTLB_DYNAMIC does not help nor does increasing the
> > number of slots - it has something to do with the number/size of DMA
> > buffers that ath11k is asking for:
> yeah, ath11k asks for fixed size DMA buffer regardless of that config.
>
> > # dmesg | grep swiotlb_tbl_map_single
> > [ 5.237731] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 16384 (slots=32768/ 32)
> > [ 5.247519] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 16416 (slots=32768/ 64)
> > [ 5.261794] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 16448 (slots=32768/ 96)
> > [ 5.275114] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 16480 (slots=32768/ 128)
> > [ 5.287757] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 16512 (slots=32768/ 160)
> > [ 5.299688] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 16544 (slots=32768/ 192)
> > [ 5.312482] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 16576 (slots=32768/ 224)
> > [ 5.324493] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 16608 (slots=32768/ 256)
> > [ 5.337001] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 16640 (slots=32768/ 288)
> > [ 5.346754] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 16672 (slots=32768/ 320)
> > [ 5.356571] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 16704 (slots=32768/ 352)
> > [ 5.366372] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 16736 (slots=32768/ 384)
> > [ 5.376164] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 16768 (slots=32768/ 416)
> > [ 5.385944] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 16800 (slots=32768/ 448)
> > [ 5.395712] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 16832 (slots=32768/ 480)
> > [ 5.408270] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 16864 (slots=32768/ 512)
> > [ 5.419768] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 16896 (slots=32768/ 544)
> > [ 5.430966] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 16928 (slots=32768/ 576)
> > [ 5.442368] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 16960 (slots=32768/ 608)
> > [ 5.452422] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 16992 (slots=32768/ 640)
> > [ 5.463507] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 17024 (slots=32768/ 672)
> > [ 5.473536] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 17056 (slots=32768/ 704)
> > [ 5.485661] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 17088 (slots=32768/ 736)
> > [ 5.495404] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 17120 (slots=32768/ 768)
> > [ 5.509626] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 17152 (slots=32768/ 800)
> > [ 5.519353] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 17184 (slots=32768/ 832)
> > [ 5.529077] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 17216 (slots=32768/ 864)
> > [ 5.538799] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 17248 (slots=32768/ 896)
> > [ 5.548517] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 17280 (slots=32768/ 928)
> > [ 5.558238] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 17312 (slots=32768/ 960)
> > [ 5.567965] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 17344 (slots=32768/ 992)
> > [ 5.578943] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 0 (slots=32768/ 992)
> > [ 5.578964] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 52B index= 8192 (slots=32768/ 993)
> > [ 5.599793] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 32 (slots=32768/ 992)
> > [ 5.599861] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 68B index= 8193 (slots=32768/ 993)
> > [ 5.609589] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 64 (slots=32768/ 993)
> > [ 5.628921] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 96 (slots=32768/ 992)
> > [ 5.638703] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 68B index= 17376 (slots=32768/ 993)
> > [ 5.649602] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 128 (slots=32768/ 992)
> > [ 5.659389] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 160 (slots=32768/ 992)
> > [ 5.674038] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 96B index= 17377 (slots=32768/ 993)
> > [ 5.685016] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 192 (slots=32768/ 992)
> > [ 5.694819] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 224 (slots=32768/ 992)
> > [ 5.694831] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 52B index= 17378 (slots=32768/ 993)
> > [ 5.714194] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 40B index= 17379 (slots=32768/ 994)
> > [ 5.725089] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 256 (slots=32768/ 992)
> > [ 5.753507] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 6224B index= 17380 (slots=32768/ 996)
> > [ 5.764668] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 288 (slots=32768/ 992)
> > [ 5.774456] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 320 (slots=32768/ 992)
> > [ 5.774620] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 6224B index= 17384 (slots=32768/ 996)
> > [ 5.795091] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 352 (slots=32768/ 992)
> > [ 5.795241] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 6224B index= 17388 (slots=32768/ 996)
> > [ 5.815724] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 384 (slots=32768/ 992)
> > [ 5.815884] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 6224B index= 17392 (slots=32768/ 996)
> > [ 5.836357] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 416 (slots=32768/ 992)
> > [ 5.836368] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 52B index= 8194 (slots=32768/ 993)
> > [ 5.855856] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 6224B index= 17396 (slots=32768/ 997)
> > [ 5.866818] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 448 (slots=32768/ 992)
> > [ 5.866978] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 6224B index= 17400 (slots=32768/ 996)
> > [ 5.887451] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 480 (slots=32768/ 992)
> > [ 5.897231] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 512 (slots=32768/ 992)
> > [ 5.897389] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 6224B index= 17404 (slots=32768/ 996)
> > [ 5.917866] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 544 (slots=32768/ 992)
> > [ 5.918026] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 6224B index= 17408 (slots=32768/ 996)
> > [ 5.938489] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 576 (slots=32768/ 992)
> > [ 5.938642] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 6224B index= 17412 (slots=32768/ 996)
> > [ 5.959121] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 608 (slots=32768/ 992)
> > [ 5.959135] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 52B index= 8195 (slots=32768/ 993)
> > [ 5.978619] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 6224B index= 17416 (slots=32768/ 997)
> > [ 5.989588] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 640 (slots=32768/ 992)
> > [ 5.989738] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 6224B index= 17420 (slots=32768/ 996)
> > [ 6.010215] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 672 (slots=32768/ 992)
> > [ 6.020001] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 704 (slots=32768/ 992)
> > [ 6.020158] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 6224B index= 17424 (slots=32768/ 996)
> > [ 6.040643] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 736 (slots=32768/ 992)
> > [ 6.040798] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 6224B index= 17428 (slots=32768/ 996)
> > [ 6.061287] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 768 (slots=32768/ 992)
> > [ 6.061437] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 6224B index= 17432 (slots=32768/ 996)
> > [ 6.081918] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 800 (slots=32768/ 992)
> > [ 6.081929] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 52B index= 8196 (slots=32768/ 993)
> > [ 6.101409] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 6224B index= 17436 (slots=32768/ 997)
> > [ 6.112375] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 832 (slots=32768/ 992)
> > [ 6.112528] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 6224B index= 17440 (slots=32768/ 996)
> > [ 6.133004] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 864 (slots=32768/ 992)
> > [ 6.142785] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 896 (slots=32768/ 992)
> > [ 6.142949] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 6224B index= 17444 (slots=32768/ 996)
> > [ 6.163426] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 928 (slots=32768/ 992)
> > [ 6.163576] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 6224B index= 17448 (slots=32768/ 996)
> > [ 6.184058] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 960 (slots=32768/ 992)
> > [ 6.184208] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 6224B index= 17452 (slots=32768/ 996)
> > [ 6.204691] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 992 (slots=32768/ 992)
> > [ 6.204704] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 52B index= 8197 (slots=32768/ 993)
> > [ 6.224183] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 6224B index= 17456 (slots=32768/ 997)
> > [ 6.235148] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 1024 (slots=32768/ 992)
> > [ 6.235308] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 6224B index= 17460 (slots=32768/ 996)
> > [ 6.255777] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 1056 (slots=32768/ 992)
> > [ 6.265552] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 1088 (slots=32768/ 992)
> > [ 6.265633] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 2128B index= 17464 (slots=32768/ 994)
> > [ 6.286142] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 1120 (slots=32768/ 992)
> > [ 6.286182] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 72B index= 17466 (slots=32768/ 993)
> > [ 7.574489] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 1152 (slots=32768/ 992)
> > [ 7.584645] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 60B index= 17467 (slots=32768/ 993)
> > [ 7.595593] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 1184 (slots=32768/ 992)
> > [ 7.595608] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 52B index= 8198 (slots=32768/ 993)
> > [ 7.605359] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 1216 (slots=32768/ 993)
> > [ 7.624703] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 452B index= 1248 (slots=32768/ 993)
> > [ 7.635603] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 1280 (slots=32768/ 992)
> > [ 7.645344] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 52B index= 1312 (slots=32768/ 993)
> > [ 7.656247] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 1314 (slots=32768/ 992)
> > [ 7.683567] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> > 65535B index= 1346 (slots=32768/ 992)
> > [ 7.696095] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single
> > size=1048583B index= -1 (slots=32768/ 992)
> >
> > I'm still trying to understand the swiotlb allocation to see if there
> > is some configuration change I should be making.
>
> I suspect you hit the same issue mentioned here:
>
> https://lore.kernel.org/all/CAOMZO5A7+nxACoBPY0k8cOpVQByZtEV_N1489MK5wETHF_RXWA@mail.gmail.com/
>
> so can you check if below commit present in your kernel, and if not could you pick it up
> and try again?
>
> commit 14cebf689a78 ("swiotlb: Reinstate page-alignment for mappings >= PAGE_SIZE")
>
I bisected the 'swiotlb buffer is full' issue back to commit
aaf244141ed7 ("wifi: ath11k: fix IOMMU errors on buffer rings"), which
looks to me to be a legitimate fix. If I revert it, swiotlb is happy
and the driver registers, but I get the crash on client connect that I
was seeing in 6.6. So that commit fixes one issue but causes the
swiotlb mapping to fail.
The issue seems to be that the swiotlb buffer is getting too
fragmented to satisfy what ath11k is now asking for: a lot of 2K and
64K buffers, and then finally a 1048583B buffer, which fails due to
the fragmentation of the swiotlb buffer.
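As a rough cross-check of the sizes in the debug log, the slot arithmetic can be sketched as follows. The constants are assumptions taken from include/linux/swiotlb.h in recent kernels (IO_TLB_SHIFT of 11, i.e. 2KiB slots, and IO_TLB_SEGSIZE of 128 contiguous slots per mapping) and should be verified against the tree in use; the helper names are illustrative:

```python
IO_TLB_SHIFT = 11                # assumption: 2KiB per swiotlb slot
IO_TLB_SIZE = 1 << IO_TLB_SHIFT
IO_TLB_SEGSIZE = 128             # assumption: max contiguous slots per mapping


def slots_needed(size):
    """Number of swiotlb slots a mapping of `size` bytes occupies (rounded up)."""
    return -(-size // IO_TLB_SIZE)


def fits_in_one_segment(size):
    """True if the mapping can be served from one contiguous segment."""
    return slots_needed(size) <= IO_TLB_SEGSIZE


# Sizes taken from the swiotlb_tbl_map_single debug output above.
for size in (2128, 6224, 65535, 1048583):
    print(f"{size:>8}B -> {slots_needed(size):>3} slots, "
          f"fits in one segment: {fits_in_one_segment(size)}")
```

If those constants hold, the 65535B requests take 32 slots each, which matches the index stepping by 32 in the log, while the 1048583B request would need 513 slots and exceed a single 128-slot segment. That would be worth ruling out as a factor alongside fragmentation.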
I'm guessing that this has gone unnoticed for a while because there
are probably not many systems out there that require swiotlb with
ath11k (either no IOMMU or more memory than DMA can address). My
guess is that if you test ath11k with swiotlb=force you will easily
see this 'swiotlb buffer is full' issue on other systems.
I'm not that knowledgeable about ath11k, but I do know that ath10k and
ath12k do not have this issue with swiotlb. Debugging a bit shows that
there are a lot of large DMA buffers being requested by ath11k, and I'm
wondering if that could be reduced or optimized somehow.
>
> >
> > To avoid using swiotlb is there some way to limit the memory region
> > used for DMA operations to below 32bit boundary yet still allow the
> > memory above 32bit to be useful in the system for userspace maybe?
> if you are using dma_alloc_coherent() I'm afraid there is no way for that. the API
> internally ignores any zone flags passed with the 'gfp' argument. see
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/dma/mapping.c#n615
>
Is DMA_RESTRICTED_POOL a solution for me?
Best Regards,
Tim
* Re: ath11k swiotlb buffer is full (on IMX8M with 4GiB DRAM)
2024-11-23 0:43 ath11k swiotlb buffer is full (on IMX8M with 4GiB DRAM) Tim Harvey
@ 2024-11-25 7:23 ` Baochen Qiang
2024-11-25 18:02 ` Tim Harvey
0 siblings, 1 reply; 17+ messages in thread
From: Baochen Qiang @ 2024-11-25 7:23 UTC
To: Tim Harvey; +Cc: ath11k, linux-wireless, Fabio Estevam
On 11/23/2024 8:43 AM, Tim Harvey wrote:
> On Thu, Nov 21, 2024 at 9:51 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote:
>>
>>
>>
>> On 11/22/2024 5:50 AM, Tim Harvey wrote:
>>> On Tue, Nov 19, 2024 at 6:32 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote:
>>>>
>>>>
>>>>
>>>> On 11/20/2024 4:16 AM, Tim Harvey wrote:
>>>>> Greetings,
>>>>>
>>>>> I've got an ath11k card that is failing to init on an IMX8MM system
>>>>> with 4GB of DRAM:
>>>>> [ 7.551582] ath11k_pci 0000:01:00.0: BAR 0 [mem
>>>>> 0x18000000-0x181fffff 64bit]: assigned
>>>>> [ 7.551713] ath11k_pci 0000:01:00.0: enabling device (0000 -> 0002)
>>>>> [ 7.552401] ath11k_pci 0000:01:00.0: MSI vectors: 16
>>>>> [ 7.552440] ath11k_pci 0000:01:00.0: qcn9074 hw1.0
>>>>> [ 7.887186] mhi mhi0: Loaded FW: ath11k/QCN9074/hw1.0/amss.bin,
>>>>> sha256: 5ee1b7b204541b5f99984f21d694ececaec08fbce1b520ffe6fe740b02a4afd7
>>>>> [ 8.435964] ath11k_pci 0000:01:00.0: chip_id 0x0 chip_family 0x0
>>>>> board_id 0xff soc_id 0xffffffff
>>>>> [ 8.435991] ath11k_pci 0000:01:00.0: fw_version 0x270206d0
>>>>> fw_build_timestamp 2022-08-04 12:48 fw_build_id
>>>>> WLAN.HK.2.7.0.1-01744-QCAHKSWPL_SILICONZ-1
>>>>> [ 8.441700] ath11k_pci 0000:01:00.0: Loaded FW:
>>>>> ath11k/QCN9074/hw1.0/board-2.bin, sha256:
>>>>> dbf0ca14aa1229eccd48f26f1026901b9718b143bd30b51b8ea67c84ba6207f1
>>>>> [ 9.753764] ath11k_pci 0000:01:00.0: Loaded FW:
>>>>> ath11k/QCN9074/hw1.0/m3.bin, sha256:
>>>>> b6d957f335073a15a8de809398e1506f0200a08747eaf7189c843cf519ffc1de
>>>>> [ 9.789791] ath11k_pci 0000:01:00.0: swiotlb buffer is full (sz:
>>>>> 1048583 bytes), total 32768 (slots), used 2528 (slots)
>>>>> [ 9.789853] ath11k_pci 0000:01:00.0: failed to set up tcl_comp ring (0) :-12
>>>>> [ 9.790238] ath11k_pci 0000:01:00.0: failed to init DP: -12
>>>>> root@noble-venice:~# cat /proc/cmdline
>>>>> console=ttymxc1,115200 earlycon=ec_imx6q,0x30890000,115200
>>>>> root=PARTUUID=5cdde84f-01 rootwait net.ifnames=0 cma=196M
>>>>>
>>>>> The IMX8MM's DRAM base is at 1GB so anything above 3GB hits the 32bit
>>>>> address boundary. If I pass in a mem=3096M the device registers just
>>>>> fine.
>>>> yeah ... that parameter makes kernel alloc memory below 32bit boundary, thus swiotlb is not necessary.
>>>
>>> Hi Baochen,
>>>
>>> Yes, that makes sense as I step through the code. On IMX8M with DRAM
>>> 3GB or less dma_capable(...) is true so swiotlb bounce buffers are not
>>> needed.
>>>
>>>>
>>>>>
>>>>> I found this to be the case with modern kernels however I found
>>>>> differing behavior with older kernels:
>>>>> - 6.6 and 6.1 the device registers with 4GB DRAM but crashes on client connect
>>>>> - 5.15 devices registers with 4GB DRAM and appears to work just fine
>>>> are you using Linus' tree or the stable tree?
>>>>
>>>
>>> For 6.6 I tested stable.
>> can you try Linus's tree ? as I know the stable tree is possible to miss some important fix.
>>
>>>
>>> This likely has something to do with commit dbd73acb22d8 ("wifi:
>>> ath11k: enable 36 bit mask for stream DMA") but it would seem to me
>>> that patch was trying to avoid the entire 32bit DMA limitation. Maybe
>>> that patch sets the ath11k device DMA mask to 36 bits but maybe the
>>> IMX8M PCI DMA is only capable of 32bits?
>> that patch is making situation better, not worse. that said, it helps to avoid swiotlb in
>> ath11k DMA, rather than to get it involved.
>>
>
> Yes, that patch would be an improvement on systems capable of
> addressing 64bit memory but not on the IMX8M which is seemingly
> capable of only 32bit DMA over PCI.
>
>>>
>>>>>
>>>>> Could anyone explain what is going on here? Obviously there have been
>>>>> changes at some point to start using swiotlb which I believe was all
>>>>> about avoiding 32bit DMA limitations but I'm not clear how I should be
>>>>> configuring this for IMX8MM with 4GB DRAM. Maybe my kernel IOMMU
>>>>> configuration is incorrect somehow?
>>>> there are quite some options associated with IOMMU, not sure which one might be causing this. But basically you may check:
>>>>
>>>> CONFIG_IOMMU_IOVA
>>>> CONFIG_IOMMU_API
>>>> CONFIG_IOMMU_SUPPORT
>>>> CONFIG_IOMMU_DMA=y
>>>>
>>>
>>> These are enabled which I believe appropriate for IMX8M. If I want to
>>> utilize the full 4GB DRAM on IMX then I must use IOMMU and swiotlb
>>> which would mean a performance hit due to copying mem to/from bounce
>>> buffers not to mention the fact that I can't figure out how to
>>> configure the system to avoid the 'swiotlb swiotlb buffer is full'
>>> issue.
>
> My statement regarding needing an IOMMU above is wrong; apparently the
> IMX8M SoC's don't have an IOMMU but the fact I have it enabled in the
> kernel should be a don't-care. If I understand swiotlb correctly, if I
> did have an IOMMU then it would be used instead of swiotlb.
>
>>>
>>> Enabling CONFIG_SWIOTLB_DYNAMIC does not help nor does increasing the
>>> number of slots - it has something to do with the number/size of DMA
>>> buffers that ath11k is asking for:
>> yeah, ath11k asks for fixed size DMA buffer regardless of that config.
>>
>>> # dmesg | grep swiotlb_tbl_map_single
>>> [ 5.237731] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 16384 (slots=32768/ 32)
>>> [ 5.247519] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 16416 (slots=32768/ 64)
>>> [ 5.261794] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 16448 (slots=32768/ 96)
>>> [ 5.275114] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 16480 (slots=32768/ 128)
>>> [ 5.287757] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 16512 (slots=32768/ 160)
>>> [ 5.299688] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 16544 (slots=32768/ 192)
>>> [ 5.312482] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 16576 (slots=32768/ 224)
>>> [ 5.324493] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 16608 (slots=32768/ 256)
>>> [ 5.337001] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 16640 (slots=32768/ 288)
>>> [ 5.346754] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 16672 (slots=32768/ 320)
>>> [ 5.356571] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 16704 (slots=32768/ 352)
>>> [ 5.366372] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 16736 (slots=32768/ 384)
>>> [ 5.376164] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 16768 (slots=32768/ 416)
>>> [ 5.385944] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 16800 (slots=32768/ 448)
>>> [ 5.395712] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 16832 (slots=32768/ 480)
>>> [ 5.408270] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 16864 (slots=32768/ 512)
>>> [ 5.419768] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 16896 (slots=32768/ 544)
>>> [ 5.430966] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 16928 (slots=32768/ 576)
>>> [ 5.442368] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 16960 (slots=32768/ 608)
>>> [ 5.452422] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 16992 (slots=32768/ 640)
>>> [ 5.463507] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 17024 (slots=32768/ 672)
>>> [ 5.473536] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 17056 (slots=32768/ 704)
>>> [ 5.485661] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 17088 (slots=32768/ 736)
>>> [ 5.495404] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 17120 (slots=32768/ 768)
>>> [ 5.509626] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 17152 (slots=32768/ 800)
>>> [ 5.519353] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 17184 (slots=32768/ 832)
>>> [ 5.529077] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 17216 (slots=32768/ 864)
>>> [ 5.538799] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 17248 (slots=32768/ 896)
>>> [ 5.548517] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 17280 (slots=32768/ 928)
>>> [ 5.558238] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 17312 (slots=32768/ 960)
>>> [ 5.567965] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 17344 (slots=32768/ 992)
>>> [ 5.578943] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 0 (slots=32768/ 992)
>>> [ 5.578964] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 52B index= 8192 (slots=32768/ 993)
>>> [ 5.599793] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 32 (slots=32768/ 992)
>>> [ 5.599861] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 68B index= 8193 (slots=32768/ 993)
>>> [ 5.609589] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 64 (slots=32768/ 993)
>>> [ 5.628921] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 96 (slots=32768/ 992)
>>> [ 5.638703] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 68B index= 17376 (slots=32768/ 993)
>>> [ 5.649602] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 128 (slots=32768/ 992)
>>> [ 5.659389] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 160 (slots=32768/ 992)
>>> [ 5.674038] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 96B index= 17377 (slots=32768/ 993)
>>> [ 5.685016] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 192 (slots=32768/ 992)
>>> [ 5.694819] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 224 (slots=32768/ 992)
>>> [ 5.694831] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 52B index= 17378 (slots=32768/ 993)
>>> [ 5.714194] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 40B index= 17379 (slots=32768/ 994)
>>> [ 5.725089] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 256 (slots=32768/ 992)
>>> [ 5.753507] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 6224B index= 17380 (slots=32768/ 996)
>>> [ 5.764668] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 288 (slots=32768/ 992)
>>> [ 5.774456] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 320 (slots=32768/ 992)
>>> [ 5.774620] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 6224B index= 17384 (slots=32768/ 996)
>>> [ 5.795091] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 352 (slots=32768/ 992)
>>> [ 5.795241] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 6224B index= 17388 (slots=32768/ 996)
>>> [ 5.815724] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 384 (slots=32768/ 992)
>>> [ 5.815884] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 6224B index= 17392 (slots=32768/ 996)
>>> [ 5.836357] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 416 (slots=32768/ 992)
>>> [ 5.836368] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 52B index= 8194 (slots=32768/ 993)
>>> [ 5.855856] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 6224B index= 17396 (slots=32768/ 997)
>>> [ 5.866818] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 448 (slots=32768/ 992)
>>> [ 5.866978] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 6224B index= 17400 (slots=32768/ 996)
>>> [ 5.887451] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 480 (slots=32768/ 992)
>>> [ 5.897231] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 512 (slots=32768/ 992)
>>> [ 5.897389] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 6224B index= 17404 (slots=32768/ 996)
>>> [ 5.917866] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 544 (slots=32768/ 992)
>>> [ 5.918026] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 6224B index= 17408 (slots=32768/ 996)
>>> [ 5.938489] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 576 (slots=32768/ 992)
>>> [ 5.938642] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 6224B index= 17412 (slots=32768/ 996)
>>> [ 5.959121] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 608 (slots=32768/ 992)
>>> [ 5.959135] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 52B index= 8195 (slots=32768/ 993)
>>> [ 5.978619] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 6224B index= 17416 (slots=32768/ 997)
>>> [ 5.989588] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 640 (slots=32768/ 992)
>>> [ 5.989738] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 6224B index= 17420 (slots=32768/ 996)
>>> [ 6.010215] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 672 (slots=32768/ 992)
>>> [ 6.020001] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 704 (slots=32768/ 992)
>>> [ 6.020158] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 6224B index= 17424 (slots=32768/ 996)
>>> [ 6.040643] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 736 (slots=32768/ 992)
>>> [ 6.040798] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 6224B index= 17428 (slots=32768/ 996)
>>> [ 6.061287] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 768 (slots=32768/ 992)
>>> [ 6.061437] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 6224B index= 17432 (slots=32768/ 996)
>>> [ 6.081918] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 800 (slots=32768/ 992)
>>> [ 6.081929] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 52B index= 8196 (slots=32768/ 993)
>>> [ 6.101409] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 6224B index= 17436 (slots=32768/ 997)
>>> [ 6.112375] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 832 (slots=32768/ 992)
>>> [ 6.112528] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 6224B index= 17440 (slots=32768/ 996)
>>> [ 6.133004] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 864 (slots=32768/ 992)
>>> [ 6.142785] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 896 (slots=32768/ 992)
>>> [ 6.142949] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 6224B index= 17444 (slots=32768/ 996)
>>> [ 6.163426] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 928 (slots=32768/ 992)
>>> [ 6.163576] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 6224B index= 17448 (slots=32768/ 996)
>>> [ 6.184058] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 960 (slots=32768/ 992)
>>> [ 6.184208] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 6224B index= 17452 (slots=32768/ 996)
>>> [ 6.204691] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 992 (slots=32768/ 992)
>>> [ 6.204704] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 52B index= 8197 (slots=32768/ 993)
>>> [ 6.224183] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 6224B index= 17456 (slots=32768/ 997)
>>> [ 6.235148] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 1024 (slots=32768/ 992)
>>> [ 6.235308] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 6224B index= 17460 (slots=32768/ 996)
>>> [ 6.255777] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 1056 (slots=32768/ 992)
>>> [ 6.265552] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 1088 (slots=32768/ 992)
>>> [ 6.265633] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 2128B index= 17464 (slots=32768/ 994)
>>> [ 6.286142] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 1120 (slots=32768/ 992)
>>> [ 6.286182] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 72B index= 17466 (slots=32768/ 993)
>>> [ 7.574489] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 1152 (slots=32768/ 992)
>>> [ 7.584645] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 60B index= 17467 (slots=32768/ 993)
>>> [ 7.595593] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 1184 (slots=32768/ 992)
>>> [ 7.595608] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 52B index= 8198 (slots=32768/ 993)
>>> [ 7.605359] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 1216 (slots=32768/ 993)
>>> [ 7.624703] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 452B index= 1248 (slots=32768/ 993)
>>> [ 7.635603] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 1280 (slots=32768/ 992)
>>> [ 7.645344] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 52B index= 1312 (slots=32768/ 993)
>>> [ 7.656247] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 1314 (slots=32768/ 992)
>>> [ 7.683567] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>> 65535B index= 1346 (slots=32768/ 992)
>>> [ 7.696095] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single
>>> size=1048583B index= -1 (slots=32768/ 992)
>>>
>>> I'm still trying to understand the swiotlb allocation to see if there
>>> is some configuration change I should be making.
>>
>> I suspect you hit the same issue mentioned here:
>>
>> https://lore.kernel.org/all/CAOMZO5A7+nxACoBPY0k8cOpVQByZtEV_N1489MK5wETHF_RXWA@mail.gmail.com/
>>
>> so can you check if below commit present in your kernel, and if not could you pick it up
>> and try again?
>>
>> commit 14cebf689a78 ("swiotlb: Reinstate page-alignment for mappings >= PAGE_SIZE")
ignore this request since it should not be related to your issue :(
>>
>
> I bisected the 'swiotlb buffer is full' issue back to commit
> aaf244141ed7 ("wifi: ath11k: fix IOMMU errors on buffer rings") which
> looks to me to be a legitimate fix and if I revert it swiotlb is now
> happy and the driver registers but I get the crash on client connect
> that I was seeing in 6.6 so that commit fixes an issue, but causes
> swiotlb to not be fulfilled.
not really ... that commit is not the cause of your issue. you don't see the 'swiotlb
full' error after reverting it simply because dma_map_single() is NOT called then.
>
> The issue seems to be that the swiotlb memory buffer allocator is
> getting too fragmented to be useful with what ath11k is now asking for
> (a lot of 2K and 64K buffers and then finally a 1048583B buffer which
> fails due to the fragmentation of the swiotlb buffer.
no, the direct cause of the 'swiotlb full' error is that the kernel does not allow a swiotlb map
request larger than 256 KiB [1]:

'A single allocation from swiotlb is limited to IO_TLB_SIZE * IO_TLB_SEGSIZE bytes, which
is 256 KiB with current definitions'

while here ath11k is requesting a buffer of 1048583 bytes.

however, the question is why swiotlb is involved here at all: for streaming DMA operations
ath11k is capable of addressing 64GB of memory (with a 36bit DMA mask), which in your case
covers the whole system memory. the most likely reason I can think of is that swiotlb is
forcibly enabled in your kernel (with swiotlb=force?) such that each DMA buffer would be
bounced by swiotlb regardless of its physical address.

[1] Documentation/core-api/swiotlb.rst
>
> I'm guessing that this has gone unnoticed for a while because there
> are maybe not a lot of systems out there that require swiotlb with
> ath11k (either no IOMMU or more memory than DMA can address) and my
> guess is that if you test ath11k with swiotlb=force you will easily
> see this 'swiotlb buffer is full' issue on other systems.
>
> I'm not that knowledgeable about ath11k but I do know that ath10 and
> ath12k do not have this issue with swiotlb. Debugging a bit shows that
> there are a lot of large DMA buffers being requested by ath11k and I'm
> wondering if that could be reduced or optimized somehow.
>
>>
>>>
>>> To avoid using swiotlb is there some way to limit the memory region
>>> used for DMA operations to below 32bit boundary yet still allow the
>>> memory above 32bit to be useful in the system for userspace maybe?
>> if you are using dma_alloc_coherent() I'm afraid there is no way for that. the API
>> internally ignores any zone flags passed with the 'gfp' argument. see
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/dma/mapping.c#n615
>>
>
> is DMA_RESTRICTED_POOL a solution for me?
I don't think this helps, since this is used in coherent DMA?
>
> Best Regards,
>
> Tim
>
* Re: ath11k swiotlb buffer is full (on IMX8M with 4GiB DRAM)
2024-11-25 7:23 ` Baochen Qiang
@ 2024-11-25 18:02 ` Tim Harvey
2024-11-26 2:46 ` Baochen Qiang
0 siblings, 1 reply; 17+ messages in thread
From: Tim Harvey @ 2024-11-25 18:02 UTC (permalink / raw)
To: Baochen Qiang; +Cc: ath11k, linux-wireless, Fabio Estevam
On Sun, Nov 24, 2024 at 11:23 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote:
>
>
>
> On 11/23/2024 8:43 AM, Tim Harvey wrote:
> > On Thu, Nov 21, 2024 at 9:51 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote:
> >>
> >>
> >>
> >> On 11/22/2024 5:50 AM, Tim Harvey wrote:
> >>> On Tue, Nov 19, 2024 at 6:32 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 11/20/2024 4:16 AM, Tim Harvey wrote:
> >>>>> Greetings,
> >>>>>
> >>>>> I've got an ath11k card that is failing to init on an IMX8MM system
> >>>>> with 4GB of DRAM:
> >>>>> [ 7.551582] ath11k_pci 0000:01:00.0: BAR 0 [mem
> >>>>> 0x18000000-0x181fffff 64bit]: assigned
> >>>>> [ 7.551713] ath11k_pci 0000:01:00.0: enabling device (0000 -> 0002)
> >>>>> [ 7.552401] ath11k_pci 0000:01:00.0: MSI vectors: 16
> >>>>> [ 7.552440] ath11k_pci 0000:01:00.0: qcn9074 hw1.0
> >>>>> [ 7.887186] mhi mhi0: Loaded FW: ath11k/QCN9074/hw1.0/amss.bin,
> >>>>> sha256: 5ee1b7b204541b5f99984f21d694ececaec08fbce1b520ffe6fe740b02a4afd7
> >>>>> [ 8.435964] ath11k_pci 0000:01:00.0: chip_id 0x0 chip_family 0x0
> >>>>> board_id 0xff soc_id 0xffffffff
> >>>>> [ 8.435991] ath11k_pci 0000:01:00.0: fw_version 0x270206d0
> >>>>> fw_build_timestamp 2022-08-04 12:48 fw_build_id
> >>>>> WLAN.HK.2.7.0.1-01744-QCAHKSWPL_SILICONZ-1
> >>>>> [ 8.441700] ath11k_pci 0000:01:00.0: Loaded FW:
> >>>>> ath11k/QCN9074/hw1.0/board-2.bin, sha256:
> >>>>> dbf0ca14aa1229eccd48f26f1026901b9718b143bd30b51b8ea67c84ba6207f1
> >>>>> [ 9.753764] ath11k_pci 0000:01:00.0: Loaded FW:
> >>>>> ath11k/QCN9074/hw1.0/m3.bin, sha256:
> >>>>> b6d957f335073a15a8de809398e1506f0200a08747eaf7189c843cf519ffc1de
> >>>>> [ 9.789791] ath11k_pci 0000:01:00.0: swiotlb buffer is full (sz:
> >>>>> 1048583 bytes), total 32768 (slots), used 2528 (slots)
> >>>>> [ 9.789853] ath11k_pci 0000:01:00.0: failed to set up tcl_comp ring (0) :-12
> >>>>> [ 9.790238] ath11k_pci 0000:01:00.0: failed to init DP: -12
> >>>>> root@noble-venice:~# cat /proc/cmdline
> >>>>> console=ttymxc1,115200 earlycon=ec_imx6q,0x30890000,115200
> >>>>> root=PARTUUID=5cdde84f-01 rootwait net.ifnames=0 cma=196M
> >>>>>
> >>>>> The IMX8MM's DRAM base is at 1GB so anything above 3GB hits the 32bit
> >>>>> address boundary. If I pass in a mem=3096M the device registers just
> >>>>> fine.
> >>>> yeah ... that parameter makes kernel alloc memory below 32bit boundary, thus swiotlb is not necessary.
> >>>
> >>> Hi Baochen,
> >>>
> >>> Yes, that makes sense as I step through the code. On IMX8M with DRAM
> >>> 3GB or less dma_capable(...) is true so swiotlb bounce buffers are not
> >>> needed.
> >>>
> >>>>
> >>>>>
> >>>>> I found this to be the case with modern kernels however I found
> >>>>> differing behavior with older kernels:
> >>>>> - 6.6 and 6.1 the device registers with 4GB DRAM but crashes on client connect
> >>>>> - 5.15 devices registers with 4GB DRAM and appears to work just fine
> >>>> are you using Linus' tree or the stable tree?
> >>>>
> >>>
> >>> For 6.6 I tested stable.
> >> can you try Linus's tree ? as I know the stable tree is possible to miss some important fix.
> >>
> >>>
> >>> This likely has something to do with commit dbd73acb22d8 ("wifi:
> >>> ath11k: enable 36 bit mask for stream DMA") but it would seem to me
> >>> that patch was trying to avoid the entire 32bit DMA limitation. Maybe
> >>> that patch sets the ath11k device DMA mask to 36 bits but maybe the
> >>> IMX8M PCI DMA is only capable of 32bits?
> >> that patch is making situation better, not worse. that said, it helps to avoid swiotlb in
> >> ath11k DMA, rather than to get it involved.
> >>
> >
> > Yes, that patch would be an improvement on systems capable of
> > addressing 64bit memory but not on the IMX8M which is seemingly
> > capable of only 32bit DMA over PCI.
> >
> >>>
> >>>>>
> >>>>> Could anyone explain what is going on here? Obviously there have been
> >>>>> changes at some point to start using swiotlb which I believe was all
> >>>>> about avoiding 32bit DMA limitations but I'm not clear how I should be
> >>>>> configuring this for IMX8MM with 4GB DRAM. Maybe my kernel IOMMU
> >>>>> configuration is incorrect somehow?
> >>>> there are quite some options associated with IOMMU, not sure which one might be causing this. But basically you may check:
> >>>>
> >>>> CONFIG_IOMMU_IOVA
> >>>> CONFIG_IOMMU_API
> >>>> CONFIG_IOMMU_SUPPORT
> >>>> CONFIG_IOMMU_DMA=y
> >>>>
> >>>
> >>> These are enabled which I believe appropriate for IMX8M. If I want to
> >>> utilize the full 4GB DRAM on IMX then I must use IOMMU and swiotlb
> >>> which would mean a performance hit due to copying mem to/from bounce
> >>> buffers not to mention the fact that I can't figure out how to
> >>> configure the system to avoid the 'swiotlb swiotlb buffer is full'
> >>> issue.
> >
> > My statement regarding needing an IOMMU above is wrong; apparently the
> > IMX8M SoC's don't have an IOMMU but the fact I have it enabled in the
> > kernel should be a don't-care. If I understand swiotlb correctly, if I
> > did have an IOMMU then it would be used instead of swiotlb.
> >
> >>>
> >>> Enabling CONFIG_SWIOTLB_DYNAMIC does not help nor does increasing the
> >>> number of slots - it has something to do with the number/size of DMA
> >>> buffers that ath11k is asking for:
> >> yeah, ath11k asks for fixed size DMA buffer regardless of that config.
> >>
> >>> # dmesg | grep swiotlb_tbl_map_single
> >>> [ 5.237731] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 16384 (slots=32768/ 32)
> >>> [ 5.247519] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 16416 (slots=32768/ 64)
> >>> [ 5.261794] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 16448 (slots=32768/ 96)
> >>> [ 5.275114] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 16480 (slots=32768/ 128)
> >>> [ 5.287757] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 16512 (slots=32768/ 160)
> >>> [ 5.299688] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 16544 (slots=32768/ 192)
> >>> [ 5.312482] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 16576 (slots=32768/ 224)
> >>> [ 5.324493] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 16608 (slots=32768/ 256)
> >>> [ 5.337001] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 16640 (slots=32768/ 288)
> >>> [ 5.346754] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 16672 (slots=32768/ 320)
> >>> [ 5.356571] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 16704 (slots=32768/ 352)
> >>> [ 5.366372] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 16736 (slots=32768/ 384)
> >>> [ 5.376164] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 16768 (slots=32768/ 416)
> >>> [ 5.385944] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 16800 (slots=32768/ 448)
> >>> [ 5.395712] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 16832 (slots=32768/ 480)
> >>> [ 5.408270] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 16864 (slots=32768/ 512)
> >>> [ 5.419768] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 16896 (slots=32768/ 544)
> >>> [ 5.430966] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 16928 (slots=32768/ 576)
> >>> [ 5.442368] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 16960 (slots=32768/ 608)
> >>> [ 5.452422] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 16992 (slots=32768/ 640)
> >>> [ 5.463507] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 17024 (slots=32768/ 672)
> >>> [ 5.473536] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 17056 (slots=32768/ 704)
> >>> [ 5.485661] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 17088 (slots=32768/ 736)
> >>> [ 5.495404] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 17120 (slots=32768/ 768)
> >>> [ 5.509626] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 17152 (slots=32768/ 800)
> >>> [ 5.519353] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 17184 (slots=32768/ 832)
> >>> [ 5.529077] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 17216 (slots=32768/ 864)
> >>> [ 5.538799] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 17248 (slots=32768/ 896)
> >>> [ 5.548517] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 17280 (slots=32768/ 928)
> >>> [ 5.558238] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 17312 (slots=32768/ 960)
> >>> [ 5.567965] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 17344 (slots=32768/ 992)
> >>> [ 5.578943] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 0 (slots=32768/ 992)
> >>> [ 5.578964] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 52B index= 8192 (slots=32768/ 993)
> >>> [ 5.599793] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 32 (slots=32768/ 992)
> >>> [ 5.599861] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 68B index= 8193 (slots=32768/ 993)
> >>> [ 5.609589] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 64 (slots=32768/ 993)
> >>> [ 5.628921] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 96 (slots=32768/ 992)
> >>> [ 5.638703] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 68B index= 17376 (slots=32768/ 993)
> >>> [ 5.649602] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 128 (slots=32768/ 992)
> >>> [ 5.659389] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 160 (slots=32768/ 992)
> >>> [ 5.674038] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 96B index= 17377 (slots=32768/ 993)
> >>> [ 5.685016] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 192 (slots=32768/ 992)
> >>> [ 5.694819] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 224 (slots=32768/ 992)
> >>> [ 5.694831] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 52B index= 17378 (slots=32768/ 993)
> >>> [ 5.714194] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 40B index= 17379 (slots=32768/ 994)
> >>> [ 5.725089] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 256 (slots=32768/ 992)
> >>> [ 5.753507] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 6224B index= 17380 (slots=32768/ 996)
> >>> [ 5.764668] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 288 (slots=32768/ 992)
> >>> [ 5.774456] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 320 (slots=32768/ 992)
> >>> [ 5.774620] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 6224B index= 17384 (slots=32768/ 996)
> >>> [ 5.795091] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 352 (slots=32768/ 992)
> >>> [ 5.795241] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 6224B index= 17388 (slots=32768/ 996)
> >>> [ 5.815724] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 384 (slots=32768/ 992)
> >>> [ 5.815884] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 6224B index= 17392 (slots=32768/ 996)
> >>> [ 5.836357] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 416 (slots=32768/ 992)
> >>> [ 5.836368] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 52B index= 8194 (slots=32768/ 993)
> >>> [ 5.855856] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 6224B index= 17396 (slots=32768/ 997)
> >>> [ 5.866818] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 448 (slots=32768/ 992)
> >>> [ 5.866978] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 6224B index= 17400 (slots=32768/ 996)
> >>> [ 5.887451] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 480 (slots=32768/ 992)
> >>> [ 5.897231] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 512 (slots=32768/ 992)
> >>> [ 5.897389] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 6224B index= 17404 (slots=32768/ 996)
> >>> [ 5.917866] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 544 (slots=32768/ 992)
> >>> [ 5.918026] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 6224B index= 17408 (slots=32768/ 996)
> >>> [ 5.938489] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 576 (slots=32768/ 992)
> >>> [ 5.938642] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 6224B index= 17412 (slots=32768/ 996)
> >>> [ 5.959121] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 608 (slots=32768/ 992)
> >>> [ 5.959135] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 52B index= 8195 (slots=32768/ 993)
> >>> [ 5.978619] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 6224B index= 17416 (slots=32768/ 997)
> >>> [ 5.989588] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 640 (slots=32768/ 992)
> >>> [ 5.989738] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 6224B index= 17420 (slots=32768/ 996)
> >>> [ 6.010215] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 672 (slots=32768/ 992)
> >>> [ 6.020001] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 704 (slots=32768/ 992)
> >>> [ 6.020158] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 6224B index= 17424 (slots=32768/ 996)
> >>> [ 6.040643] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 736 (slots=32768/ 992)
> >>> [ 6.040798] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 6224B index= 17428 (slots=32768/ 996)
> >>> [ 6.061287] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 768 (slots=32768/ 992)
> >>> [ 6.061437] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 6224B index= 17432 (slots=32768/ 996)
> >>> [ 6.081918] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 800 (slots=32768/ 992)
> >>> [ 6.081929] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 52B index= 8196 (slots=32768/ 993)
> >>> [ 6.101409] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 6224B index= 17436 (slots=32768/ 997)
> >>> [ 6.112375] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 832 (slots=32768/ 992)
> >>> [ 6.112528] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 6224B index= 17440 (slots=32768/ 996)
> >>> [ 6.133004] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 864 (slots=32768/ 992)
> >>> [ 6.142785] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 896 (slots=32768/ 992)
> >>> [ 6.142949] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 6224B index= 17444 (slots=32768/ 996)
> >>> [ 6.163426] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 928 (slots=32768/ 992)
> >>> [ 6.163576] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 6224B index= 17448 (slots=32768/ 996)
> >>> [ 6.184058] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 960 (slots=32768/ 992)
> >>> [ 6.184208] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 6224B index= 17452 (slots=32768/ 996)
> >>> [ 6.204691] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 992 (slots=32768/ 992)
> >>> [ 6.204704] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 52B index= 8197 (slots=32768/ 993)
> >>> [ 6.224183] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 6224B index= 17456 (slots=32768/ 997)
> >>> [ 6.235148] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 1024 (slots=32768/ 992)
> >>> [ 6.235308] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 6224B index= 17460 (slots=32768/ 996)
> >>> [ 6.255777] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 1056 (slots=32768/ 992)
> >>> [ 6.265552] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 1088 (slots=32768/ 992)
> >>> [ 6.265633] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 2128B index= 17464 (slots=32768/ 994)
> >>> [ 6.286142] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 1120 (slots=32768/ 992)
> >>> [ 6.286182] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 72B index= 17466 (slots=32768/ 993)
> >>> [ 7.574489] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 1152 (slots=32768/ 992)
> >>> [ 7.584645] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 60B index= 17467 (slots=32768/ 993)
> >>> [ 7.595593] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 1184 (slots=32768/ 992)
> >>> [ 7.595608] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 52B index= 8198 (slots=32768/ 993)
> >>> [ 7.605359] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 1216 (slots=32768/ 993)
> >>> [ 7.624703] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 452B index= 1248 (slots=32768/ 993)
> >>> [ 7.635603] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 1280 (slots=32768/ 992)
> >>> [ 7.645344] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 52B index= 1312 (slots=32768/ 993)
> >>> [ 7.656247] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 1314 (slots=32768/ 992)
> >>> [ 7.683567] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>> 65535B index= 1346 (slots=32768/ 992)
> >>> [ 7.696095] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single
> >>> size=1048583B index= -1 (slots=32768/ 992)
> >>>
> >>> I'm still trying to understand the swiotlb allocation to see if there
> >>> is some configuration change I should be making.
> >>
> >> I suspect you hit the same issue mentioned here:
> >>
> >> https://lore.kernel.org/all/CAOMZO5A7+nxACoBPY0k8cOpVQByZtEV_N1489MK5wETHF_RXWA@mail.gmail.com/
> >>
> >> so can you check if below commit present in your kernel, and if not could you pick it up
> >> and try again?
> >>
> >> commit 14cebf689a78 ("swiotlb: Reinstate page-alignment for mappings >= PAGE_SIZE")
> ignore this request since it should not be related to your issue :(
>
> >>
> >
> > I bisected the 'swiotlb buffer is full' issue back to commit
> > aaf244141ed7 ("wifi: ath11k: fix IOMMU errors on buffer rings") which
> > looks to me to be a legitimate fix and if I revert it swiotlb is now
> > happy and the driver registers but I get the crash on client connect
> > that I was seeing in 6.6 so that commit fixes an issue, but causes
> > swiotlb to not be fulfilled.
> not really ... that commit is not the cause of your issue. you don't see the 'swiotlb
> full' error after reverting it simply because dma_map_single() is NOT called then.
>
>
> >
> > The issue seems to be that the swiotlb memory buffer allocator is
> > getting too fragmented to be useful with what ath11k is now asking for
> > (a lot of 2K and 64K buffers and then finally a 1048583B buffer which
> > fails due to the fragmentation of the swiotlb buffer.
> no, the direct cause of the 'swiotlb full' error is that the kernel does not allow a swiotlb map
> request larger than 256 KiB [1]:
>
> 'A single allocation from swiotlb is limited to IO_TLB_SIZE * IO_TLB_SEGSIZE bytes, which
> is 256 KiB with current definitions'
>
> while here ath11k is requesting a buffer of 1048583 bytes.
>
>
> however, the question is why swiotlb is involved here at all: for streaming DMA operations
> ath11k is capable of addressing 64GB of memory (with a 36bit DMA mask), which in your case
> covers the whole system memory. the most likely reason I can think of is that swiotlb is
> forcibly enabled in your kernel (with swiotlb=force?) such that each DMA buffer would be
> bounced by swiotlb regardless of its physical address.
>
I do not have swiotlb forced explicitly. Again, this is because I'm on
an IMX8MM with 4GiB DRAM which has no IOMMU and only 32bit DMA, where
peripherals cannot access memory over 3GiB because its DRAM base starts
at 1GiB (so swiotlb is getting used with a DRAM size >3GiB).

Reverting commit d0e2523bfa9c ("ath11k: allocate HAL_WBM2SW_RELEASE
ring from cacheable memory") indeed resolves this issue.

I notice that ath12k has a similar architecture to ath11k, where
ath12k_dp_srng_setup() looks like what ath11k_dp_srng_setup() looked like
before the change to allocate its buffers from cacheable memory, so it's
probably just a matter of time before the same changes are made to
ath12k, which will break that for this platform/memory-size as well.

So the ways I see to resolve this are either:
a) revert commit d0e2523bfa9c ("ath11k: allocate HAL_WBM2SW_RELEASE
ring from cacheable memory") - to stop asking for buffers >256KiB
b) find some other use of that upper 1GiB so that it can't be
allocated by DMA and swiotlb isn't needed
c) tell my board users to use mem=3096M and lose that last 1GiB of DRAM
>
>
> [1] Documentation/core-api/swiotlb.rst
>
> >
> > I'm guessing that this has gone unnoticed for a while because there
> > are maybe not a lot of systems out there that require swiotlb with
> > ath11k (either no IOMMU or more memory than DMA can address) and my
> > guess is that if you test ath11k with swiotlb=force you will easily
> > see this 'swiotlb buffer is full' issue on other systems.
> >
> > I'm not that knowledgeable about ath11k but I do know that ath10 and
> > ath12k do not have this issue with swiotlb. Debugging a bit shows that
> > there are a lot of large DMA buffers being requested by ath11k and I'm
> > wondering if that could be reduced or optimized somehow.
> >
> >>
> >>>
> >>> To avoid using swiotlb is there some way to limit the memory region
> >>> used for DMA operations to below 32bit boundary yet still allow the
> >>> memory above 32bit to be useful in the system for userspace maybe?
> >> if you are using dma_alloc_coherent() I'm afraid there is no way for that. the API
> >> internally ignores any zone flags passed with the 'gfp' argument. see
> >>
> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/dma/mapping.c#n615
> >>
> >
> > is DMA_RESTRICTED_POOL a solution for me?
> i don;t think this help since this is used in coherent DMA?
>
While DMA_RESTRICTED_POOL does allow defining the area used by swiotlb
it doesn't change the way swiotlb allocates buffers or the fact that
swiotlb is used at all.
Best Regards,
Tim
> >
> > Best Regards,
> >
> > Tim
> >
>
* Re: ath11k swiotlb buffer is full (on IMX8M with 4GiB DRAM)
2024-11-25 18:02 ` Tim Harvey
@ 2024-11-26 2:46 ` Baochen Qiang
2024-12-06 17:07 ` Tim Harvey
0 siblings, 1 reply; 17+ messages in thread
From: Baochen Qiang @ 2024-11-26 2:46 UTC (permalink / raw)
To: Tim Harvey; +Cc: ath11k, linux-wireless, Fabio Estevam
On 11/26/2024 2:02 AM, Tim Harvey wrote:
> On Sun, Nov 24, 2024 at 11:23 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote:
>>
>>
>>
>> On 11/23/2024 8:43 AM, Tim Harvey wrote:
>>> On Thu, Nov 21, 2024 at 9:51 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote:
>>>>
>>>>
>>>>
>>>> On 11/22/2024 5:50 AM, Tim Harvey wrote:
>>>>> On Tue, Nov 19, 2024 at 6:32 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 11/20/2024 4:16 AM, Tim Harvey wrote:
>>>>>>> Greetings,
>>>>>>>
>>>>>>> I've got an ath11k card that is failing to init on an IMX8MM system
>>>>>>> with 4GB of DRAM:
>>>>>>> [ 7.551582] ath11k_pci 0000:01:00.0: BAR 0 [mem
>>>>>>> 0x18000000-0x181fffff 64bit]: assigned
>>>>>>> [ 7.551713] ath11k_pci 0000:01:00.0: enabling device (0000 -> 0002)
>>>>>>> [ 7.552401] ath11k_pci 0000:01:00.0: MSI vectors: 16
>>>>>>> [ 7.552440] ath11k_pci 0000:01:00.0: qcn9074 hw1.0
>>>>>>> [ 7.887186] mhi mhi0: Loaded FW: ath11k/QCN9074/hw1.0/amss.bin,
>>>>>>> sha256: 5ee1b7b204541b5f99984f21d694ececaec08fbce1b520ffe6fe740b02a4afd7
>>>>>>> [ 8.435964] ath11k_pci 0000:01:00.0: chip_id 0x0 chip_family 0x0
>>>>>>> board_id 0xff soc_id 0xffffffff
>>>>>>> [ 8.435991] ath11k_pci 0000:01:00.0: fw_version 0x270206d0
>>>>>>> fw_build_timestamp 2022-08-04 12:48 fw_build_id
>>>>>>> WLAN.HK.2.7.0.1-01744-QCAHKSWPL_SILICONZ-1
>>>>>>> [ 8.441700] ath11k_pci 0000:01:00.0: Loaded FW:
>>>>>>> ath11k/QCN9074/hw1.0/board-2.bin, sha256:
>>>>>>> dbf0ca14aa1229eccd48f26f1026901b9718b143bd30b51b8ea67c84ba6207f1
>>>>>>> [ 9.753764] ath11k_pci 0000:01:00.0: Loaded FW:
>>>>>>> ath11k/QCN9074/hw1.0/m3.bin, sha256:
>>>>>>> b6d957f335073a15a8de809398e1506f0200a08747eaf7189c843cf519ffc1de
>>>>>>> [ 9.789791] ath11k_pci 0000:01:00.0: swiotlb buffer is full (sz:
>>>>>>> 1048583 bytes), total 32768 (slots), used 2528 (slots)
>>>>>>> [ 9.789853] ath11k_pci 0000:01:00.0: failed to set up tcl_comp ring (0) :-12
>>>>>>> [ 9.790238] ath11k_pci 0000:01:00.0: failed to init DP: -12
>>>>>>> root@noble-venice:~# cat /proc/cmdline
>>>>>>> console=ttymxc1,115200 earlycon=ec_imx6q,0x30890000,115200
>>>>>>> root=PARTUUID=5cdde84f-01 rootwait net.ifnames=0 cma=196M
>>>>>>>
>>>>>>> The IMX8MM's DRAM base is at 1GB so anything above 3GB hits the 32bit
>>>>>>> address boundary. If I pass in a mem=3096M the device registers just
>>>>>>> fine.
>>>>>> yeah ... that parameter makes kernel alloc memory below 32bit boundary, thus swiotlb is not necessary.
>>>>>
>>>>> Hi Baochen,
>>>>>
>>>>> Yes, that makes sense as I step through the code. On IMX8M with DRAM
>>>>> 3GB or less dma_capable(...) is true so swiotlb bounce buffers are not
>>>>> needed.
>>>>>
>>>>>>
>>>>>>>
>>>>>>> I found this to be the case with modern kernels however I found
>>>>>>> differing behavior with older kernels:
>>>>>>> - 6.6 and 6.1 the device registers with 4GB DRAM but crashes on client connect
>>>>>>> - 5.15 devices registers with 4GB DRAM and appears to work just fine
>>>>>> are you using Linus' tree or the stable tree?
>>>>>>
>>>>>
>>>>> For 6.6 I tested stable.
>>>> can you try Linus's tree ? as I know the stable tree is possible to miss some important fix.
>>>>
>>>>>
>>>>> This likely has something to do with commit dbd73acb22d8 ("wifi:
>>>>> ath11k: enable 36 bit mask for stream DMA") but it would seem to me
>>>>> that patch was trying to avoid the entire 32bit DMA limitation. Maybe
>>>>> that patch sets the ath11k device DMA mask to 36 bits but maybe the
>>>>> IMX8M PCI DMA is only capable of 32bits?
>>>> that patch is making situation better, not worse. that said, it helps to avoid swiotlb in
>>>> ath11k DMA, rather than to get it involved.
>>>>
>>>
>>> Yes, that patch would be an improvement on systems capable of
>>> addressing 64bit memory but not on the IMX8M which is seemingly
>>> capable of only 32bit DMA over PCI.
>>>
>>>>>
>>>>>>>
>>>>>>> Could anyone explain what is going on here? Obviously there have been
>>>>>>> changes at some point to start using swiotlb which I believe was all
>>>>>>> about avoiding 32bit DMA limitations but I'm not clear how I should be
>>>>>>> configuring this for IMX8MM with 4GB DRAM. Maybe my kernel IOMMU
>>>>>>> configuration is incorrect somehow?
>>>>>> there are quite some options associated with IOMMU, not sure which one might be causing this. But basically you may check:
>>>>>>
>>>>>> CONFIG_IOMMU_IOVA
>>>>>> CONFIG_IOMMU_API
>>>>>> CONFIG_IOMMU_SUPPORT
>>>>>> CONFIG_IOMMU_DMA=y
>>>>>>
>>>>>
>>>>> These are enabled which I believe appropriate for IMX8M. If I want to
>>>>> utilize the full 4GB DRAM on IMX then I must use IOMMU and swiotlb
>>>>> which would mean a performance hit due to copying mem to/from bounce
>>>>> buffers not to mention the fact that I can't figure out how to
>>>>> configure the system to avoid the 'swiotlb swiotlb buffer is full'
>>>>> issue.
>>>
>>> My statement regarding needing an IOMMU above is wrong; apparently the
>>> IMX8M SoC's don't have an IOMMU but the fact I have it enabled in the
>>> kernel should be a don't-care. If I understand swiotlb correctly, if I
>>> did have an IOMMU then it would be used instead of swiotlb.
>>>
>>>>>
>>>>> Enabling CONFIG_SWIOTLB_DYNAMIC does not help nor does increasing the
>>>>> number of slots - it has something to do with the number/size of DMA
>>>>> buffers that ath11k is asking for:
>>>> yeah, ath11k asks for fixed size DMA buffer regardless of that config.
>>>>
>>>>> # dmesg | grep swiotlb_tbl_map_single
>>>>> [ 5.237731] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 16384 (slots=32768/ 32)
>>>>> [ 5.247519] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 16416 (slots=32768/ 64)
>>>>> [ 5.261794] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 16448 (slots=32768/ 96)
>>>>> [ 5.275114] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 16480 (slots=32768/ 128)
>>>>> [ 5.287757] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 16512 (slots=32768/ 160)
>>>>> [ 5.299688] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 16544 (slots=32768/ 192)
>>>>> [ 5.312482] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 16576 (slots=32768/ 224)
>>>>> [ 5.324493] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 16608 (slots=32768/ 256)
>>>>> [ 5.337001] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 16640 (slots=32768/ 288)
>>>>> [ 5.346754] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 16672 (slots=32768/ 320)
>>>>> [ 5.356571] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 16704 (slots=32768/ 352)
>>>>> [ 5.366372] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 16736 (slots=32768/ 384)
>>>>> [ 5.376164] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 16768 (slots=32768/ 416)
>>>>> [ 5.385944] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 16800 (slots=32768/ 448)
>>>>> [ 5.395712] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 16832 (slots=32768/ 480)
>>>>> [ 5.408270] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 16864 (slots=32768/ 512)
>>>>> [ 5.419768] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 16896 (slots=32768/ 544)
>>>>> [ 5.430966] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 16928 (slots=32768/ 576)
>>>>> [ 5.442368] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 16960 (slots=32768/ 608)
>>>>> [ 5.452422] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 16992 (slots=32768/ 640)
>>>>> [ 5.463507] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 17024 (slots=32768/ 672)
>>>>> [ 5.473536] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 17056 (slots=32768/ 704)
>>>>> [ 5.485661] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 17088 (slots=32768/ 736)
>>>>> [ 5.495404] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 17120 (slots=32768/ 768)
>>>>> [ 5.509626] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 17152 (slots=32768/ 800)
>>>>> [ 5.519353] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 17184 (slots=32768/ 832)
>>>>> [ 5.529077] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 17216 (slots=32768/ 864)
>>>>> [ 5.538799] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 17248 (slots=32768/ 896)
>>>>> [ 5.548517] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 17280 (slots=32768/ 928)
>>>>> [ 5.558238] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 17312 (slots=32768/ 960)
>>>>> [ 5.567965] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 17344 (slots=32768/ 992)
>>>>> [ 5.578943] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 0 (slots=32768/ 992)
>>>>> [ 5.578964] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 52B index= 8192 (slots=32768/ 993)
>>>>> [ 5.599793] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 32 (slots=32768/ 992)
>>>>> [ 5.599861] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 68B index= 8193 (slots=32768/ 993)
>>>>> [ 5.609589] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 64 (slots=32768/ 993)
>>>>> [ 5.628921] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 96 (slots=32768/ 992)
>>>>> [ 5.638703] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 68B index= 17376 (slots=32768/ 993)
>>>>> [ 5.649602] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 128 (slots=32768/ 992)
>>>>> [ 5.659389] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 160 (slots=32768/ 992)
>>>>> [ 5.674038] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 96B index= 17377 (slots=32768/ 993)
>>>>> [ 5.685016] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 192 (slots=32768/ 992)
>>>>> [ 5.694819] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 224 (slots=32768/ 992)
>>>>> [ 5.694831] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 52B index= 17378 (slots=32768/ 993)
>>>>> [ 5.714194] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 40B index= 17379 (slots=32768/ 994)
>>>>> [ 5.725089] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 256 (slots=32768/ 992)
>>>>> [ 5.753507] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 6224B index= 17380 (slots=32768/ 996)
>>>>> [ 5.764668] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 288 (slots=32768/ 992)
>>>>> [ 5.774456] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 320 (slots=32768/ 992)
>>>>> [ 5.774620] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 6224B index= 17384 (slots=32768/ 996)
>>>>> [ 5.795091] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 352 (slots=32768/ 992)
>>>>> [ 5.795241] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 6224B index= 17388 (slots=32768/ 996)
>>>>> [ 5.815724] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 384 (slots=32768/ 992)
>>>>> [ 5.815884] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 6224B index= 17392 (slots=32768/ 996)
>>>>> [ 5.836357] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 416 (slots=32768/ 992)
>>>>> [ 5.836368] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 52B index= 8194 (slots=32768/ 993)
>>>>> [ 5.855856] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 6224B index= 17396 (slots=32768/ 997)
>>>>> [ 5.866818] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 448 (slots=32768/ 992)
>>>>> [ 5.866978] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 6224B index= 17400 (slots=32768/ 996)
>>>>> [ 5.887451] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 480 (slots=32768/ 992)
>>>>> [ 5.897231] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 512 (slots=32768/ 992)
>>>>> [ 5.897389] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 6224B index= 17404 (slots=32768/ 996)
>>>>> [ 5.917866] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 544 (slots=32768/ 992)
>>>>> [ 5.918026] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 6224B index= 17408 (slots=32768/ 996)
>>>>> [ 5.938489] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 576 (slots=32768/ 992)
>>>>> [ 5.938642] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 6224B index= 17412 (slots=32768/ 996)
>>>>> [ 5.959121] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 608 (slots=32768/ 992)
>>>>> [ 5.959135] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 52B index= 8195 (slots=32768/ 993)
>>>>> [ 5.978619] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 6224B index= 17416 (slots=32768/ 997)
>>>>> [ 5.989588] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 640 (slots=32768/ 992)
>>>>> [ 5.989738] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 6224B index= 17420 (slots=32768/ 996)
>>>>> [ 6.010215] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 672 (slots=32768/ 992)
>>>>> [ 6.020001] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 704 (slots=32768/ 992)
>>>>> [ 6.020158] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 6224B index= 17424 (slots=32768/ 996)
>>>>> [ 6.040643] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 736 (slots=32768/ 992)
>>>>> [ 6.040798] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 6224B index= 17428 (slots=32768/ 996)
>>>>> [ 6.061287] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 768 (slots=32768/ 992)
>>>>> [ 6.061437] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 6224B index= 17432 (slots=32768/ 996)
>>>>> [ 6.081918] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 800 (slots=32768/ 992)
>>>>> [ 6.081929] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 52B index= 8196 (slots=32768/ 993)
>>>>> [ 6.101409] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 6224B index= 17436 (slots=32768/ 997)
>>>>> [ 6.112375] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 832 (slots=32768/ 992)
>>>>> [ 6.112528] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 6224B index= 17440 (slots=32768/ 996)
>>>>> [ 6.133004] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 864 (slots=32768/ 992)
>>>>> [ 6.142785] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 896 (slots=32768/ 992)
>>>>> [ 6.142949] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 6224B index= 17444 (slots=32768/ 996)
>>>>> [ 6.163426] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 928 (slots=32768/ 992)
>>>>> [ 6.163576] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 6224B index= 17448 (slots=32768/ 996)
>>>>> [ 6.184058] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 960 (slots=32768/ 992)
>>>>> [ 6.184208] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 6224B index= 17452 (slots=32768/ 996)
>>>>> [ 6.204691] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 992 (slots=32768/ 992)
>>>>> [ 6.204704] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 52B index= 8197 (slots=32768/ 993)
>>>>> [ 6.224183] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 6224B index= 17456 (slots=32768/ 997)
>>>>> [ 6.235148] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 1024 (slots=32768/ 992)
>>>>> [ 6.235308] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 6224B index= 17460 (slots=32768/ 996)
>>>>> [ 6.255777] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 1056 (slots=32768/ 992)
>>>>> [ 6.265552] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 1088 (slots=32768/ 992)
>>>>> [ 6.265633] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 2128B index= 17464 (slots=32768/ 994)
>>>>> [ 6.286142] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 1120 (slots=32768/ 992)
>>>>> [ 6.286182] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 72B index= 17466 (slots=32768/ 993)
>>>>> [ 7.574489] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 1152 (slots=32768/ 992)
>>>>> [ 7.584645] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 60B index= 17467 (slots=32768/ 993)
>>>>> [ 7.595593] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 1184 (slots=32768/ 992)
>>>>> [ 7.595608] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 52B index= 8198 (slots=32768/ 993)
>>>>> [ 7.605359] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 1216 (slots=32768/ 993)
>>>>> [ 7.624703] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 452B index= 1248 (slots=32768/ 993)
>>>>> [ 7.635603] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 1280 (slots=32768/ 992)
>>>>> [ 7.645344] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 52B index= 1312 (slots=32768/ 993)
>>>>> [ 7.656247] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 1314 (slots=32768/ 992)
>>>>> [ 7.683567] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>> 65535B index= 1346 (slots=32768/ 992)
>>>>> [ 7.696095] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single
>>>>> size=1048583B index= -1 (slots=32768/ 992)
>>>>>
>>>>> I'm still trying to understand the swiotlb allocation to see if there
>>>>> is some configuration change I should be making.
>>>>
>>>> I suspect you hit the same issue mentioned here:
>>>>
>>>> https://lore.kernel.org/all/CAOMZO5A7+nxACoBPY0k8cOpVQByZtEV_N1489MK5wETHF_RXWA@mail.gmail.com/
>>>>
>>>> so can you check if below commit present in your kernel, and if not could you pick it up
>>>> and try again?
>>>>
>>>> commit 14cebf689a78 ("swiotlb: Reinstate page-alignment for mappings >= PAGE_SIZE")
>> ignore this request since it should be no related to your issue :(
>>
>>>>
>>>
>>> I bisected the 'swiotlb buffer is full' issue back to commit
>>> aaf244141ed7 ("wifi: ath11k: fix IOMMU errors on buffer rings") which
>>> looks to me to be a legitimate fix and if I revert it swiotlb is now
>>> happy and the driver registers but I get the crash on client connect
>>> that I was seeing in 6.6 so that commit fixes an issue, but causes
>>> swiotlb to not be fulfilled.
>> not really ... that commit is not the cause to your issue. you don;t see the 'swiotlb
>> full' error after revert it simply because dma_map_single() is NOT called then.
>>
>>
>>>
>>> The issue seems to be that the swiotlb memory buffer allocator is
>>> getting too fragmented to be useful with what ath11k is now asking for
>>> (a lot of 2K and 64K buffers and then finally a 1048583B buffer which
>>> fails due to the fragmentation of the swiotlb buffer.
>> no, the direct cause to 'swiotlb full' error is that kernel does not allow a swiotlb map
>> request larger than 256kb [1]:
>>
>> 'A single allocation from swiotlb is limited to IO_TLB_SIZE * IO_TLB_SEGSIZE bytes, which
>> is 256 KiB with current definitions'
>>
>> while here ath11k is requesting a buffer of 1048583 bytes.
>>
>>
>> howevr the question is that why swiotlb is involved here: for streamed DMA operation
>> ath11k is capable of addressing 64GB memory (with 36bit DMA mask), in your case this
>> covers whole system memory. the most possible reason I can think of is that swiotlb is
>> forcebly enabled in your kernel (with swiotlb=force?) such that each DMA buffer would be
>> bounced by swiotlb regardless of its physical address.
>>
>
> I do not have swiotlb forced explicitly. Again, this is because I'm on
> a IMX8MM with 4GiB DRAM which has no IOMMU and a 32bit DMA where
> peripherals can not access memory over 3GiB as its base DRAM starting
> at 1GiB (so swiotlb is getting used with a DRAM size >3GiB).
ah ... I get your point and agree. so the limitation doesn't come from the ath11k
hardware, but from the IMX8MM itself. I guess the direct cause for involving swiotlb is
that dma_capable() returns false because dev->bus_dma_limit is ((1ULL << 32) - 1).
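a simplified userspace model of that check, as I read it in
include/linux/dma-direct.h, is below. struct model_dev and the helper
names are hypothetical, and the real function has extra checks (mapping
error, low-memory bound) that are left out here:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical stand-in for the two struct device fields that matter. */
struct model_dev {
    uint64_t dma_mask;       /* what the device itself can address */
    uint64_t bus_dma_limit;  /* what the bus allows; 0 means no limit */
};

/* Smaller of the two values, ignoring a value of zero. */
static uint64_t min_not_zero_u64(uint64_t a, uint64_t b)
{
    if (a == 0)
        return b;
    if (b == 0)
        return a;
    return a < b ? a : b;
}

/* True if [addr, addr + size) is reachable without bouncing. */
static bool model_dma_capable(const struct model_dev *dev,
                              uint64_t addr, size_t size)
{
    assert(size > 0);
    uint64_t end = addr + size - 1;

    return end <= min_not_zero_u64(dev->dma_mask, dev->bus_dma_limit);
}
```

so with a 36 bit device mask and a 32 bit bus limit, a buffer at
0x100000000 is not capable and gets bounced, while the same buffer with
no bus limit would be mapped directly.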
>
> Reverting commit d0e2523bfa9c ("ath11k: allocate HAL_WBM2SW_RELEASE
> ring from cacheable memory") indeed resolves this issue.
correct. by reverting it ath11k uses dma_alloc_coherent() instead of dma_map_single(), so
the issue is gone.
>
> I notice that ath12k has a similar architecture as ath11k where
> ath12k_dp_srng_setup() looks like what ath11k_dp_srng_setup() before
> the change to allocate its buffers from cacheable memory so it's
> probably just a matter of time before the same changes are made to
> ath12k which will break that for this platform/memory-size as well.
thanks, will take care of that.
>
> So the way I see to resolve this either:
> a) revert commit d0e2523bfa9c ("ath11k: allocate HAL_WBM2SW_RELEASE
> ring from cacheable memory") - to stop asking for buffers >256KiB
> b) find some other use of that upper 1GiB so that it can't be
> allocated by DMA and swiotlb isn't needed
> c) tell my board users to use mem=3096M and lose that last 1GiB of DRAM
while the first one seems best, it impacts performance. so I have another proposal: in case
an IOMMU is not present, check the DMA addressing limitation before allocating the buffer. If it can
not cover the 36 bit memory space and the system is able to alloc buffers above 4GB, pass
GFP_DMA32 or GFP_DMA to kzalloc() such that we can get a buffer below 4GB/16MB.
anyway, can you send a patch for that?
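a rough userspace sketch of the decision I have in mind is below. it is
only a model: enum zone_hint and pick_zone_hint() are made-up names, the
hints stand in for GFP_DMA32/GFP_DMA rather than being the real flags,
and I compare the limit against the highest RAM address instead of a
fixed 36 bit value. whether kzalloc() accepts the zone flag in practice
would still need checking:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for "no zone flag", GFP_DMA32 and GFP_DMA. */
enum zone_hint { ZONE_HINT_NONE, ZONE_HINT_DMA32, ZONE_HINT_DMA };

/*
 * Pick a zone hint for a streaming DMA buffer.
 *   has_iommu        - an IOMMU can remap, so no restriction is needed
 *   dma_limit        - highest address the device can reach (inclusive)
 *   highest_ram_addr - highest physical RAM address in the system
 */
static enum zone_hint pick_zone_hint(bool has_iommu, uint64_t dma_limit,
                                     uint64_t highest_ram_addr)
{
    assert(dma_limit > 0);

    if (has_iommu)
        return ZONE_HINT_NONE;
    if (dma_limit >= highest_ram_addr)
        return ZONE_HINT_NONE;      /* device reaches all of RAM */
    if (dma_limit >= 0xFFFFFFFFULL)
        return ZONE_HINT_DMA32;     /* keep the buffer below 4GB */
    return ZONE_HINT_DMA;           /* even tighter limit */
}
```

for the IMX8MM case (no IOMMU, 32 bit limit, RAM ending at 0x13fffffff)
this picks the DMA32 hint, so the buffer would not need bouncing.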
>
>>
>>
>> [1] Documentation/core-api/swiotlb.rst
>>
>>>
>>> I'm guessing that this has gone unnoticed for a while because there
>>> are maybe not a lot of systems out there that require swiotlb with
>>> ath11k (either no IOMMU or more memory than DMA can address) and my
>>> guess is that if you test ath11k with swiotlb=force you will easily
>>> see this 'swiotlb buffer is full' issue on other systems.
>>>
>>> I'm not that knowledgeable about ath11k but I do know that ath10 and
>>> ath12k do not have this issue with swiotlb. Debugging a bit shows that
>>> there are a lot of large DMA buffers being requested by ath11k and I'm
>>> wondering if that could be reduced or optimized somehow.
>>>
>>>>
>>>>>
>>>>> To avoid using swiotlb is there some way to limit the memory region
>>>>> used for DMA operations to below 32bit boundary yet still allow the
>>>>> memory above 32bit to be useful in the system for userspace maybe?
>>>> if you are using dma_alloc_coherent() I'm afraid there is no way for that. the API
>>>> internally ignores any zone flags passed with the 'gfp' argument. see
>>>>
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/dma/mapping.c#n615
>>>>
>>>
>>> is DMA_RESTRICTED_POOL a solution for me?
>> i don;t think this help since this is used in coherent DMA?
>>
>
> While DMA_RESTRICTED_POOL does allow defining the area used by swiotlb
> it doesn't change the way swiotlb allocates buffers or the fact that
> swiotlb is used at all.
>
> Best Regards,
>
> Tim
>
>
>>>
>>> Best Regards,
>>>
>>> Tim
>>>
>>
* Re: ath11k swiotlb buffer is full (on IMX8M with 4GiB DRAM)
2024-11-26 2:46 ` Baochen Qiang
@ 2024-12-06 17:07 ` Tim Harvey
2024-12-09 6:39 ` Baochen Qiang
2024-12-09 8:17 ` Christoph Hellwig
0 siblings, 2 replies; 17+ messages in thread
From: Tim Harvey @ 2024-12-06 17:07 UTC (permalink / raw)
To: Baochen Qiang
Cc: ath11k, linux-wireless, Fabio Estevam, Christoph Hellwig,
Marek Szyprowski, Robin Murphy, iommu
On Mon, Nov 25, 2024 at 6:47 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote:
>
>
>
> On 11/26/2024 2:02 AM, Tim Harvey wrote:
> > On Sun, Nov 24, 2024 at 11:23 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote:
> >>
> >>
> >>
> >> On 11/23/2024 8:43 AM, Tim Harvey wrote:
> >>> On Thu, Nov 21, 2024 at 9:51 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 11/22/2024 5:50 AM, Tim Harvey wrote:
> >>>>> On Tue, Nov 19, 2024 at 6:32 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 11/20/2024 4:16 AM, Tim Harvey wrote:
> >>>>>>> Greetings,
> >>>>>>>
> >>>>>>> I've got an ath11k card that is failing to init on an IMX8MM system
> >>>>>>> with 4GB of DRAM:
> >>>>>>> [ 7.551582] ath11k_pci 0000:01:00.0: BAR 0 [mem
> >>>>>>> 0x18000000-0x181fffff 64bit]: assigned
> >>>>>>> [ 7.551713] ath11k_pci 0000:01:00.0: enabling device (0000 -> 0002)
> >>>>>>> [ 7.552401] ath11k_pci 0000:01:00.0: MSI vectors: 16
> >>>>>>> [ 7.552440] ath11k_pci 0000:01:00.0: qcn9074 hw1.0
> >>>>>>> [ 7.887186] mhi mhi0: Loaded FW: ath11k/QCN9074/hw1.0/amss.bin,
> >>>>>>> sha256: 5ee1b7b204541b5f99984f21d694ececaec08fbce1b520ffe6fe740b02a4afd7
> >>>>>>> [ 8.435964] ath11k_pci 0000:01:00.0: chip_id 0x0 chip_family 0x0
> >>>>>>> board_id 0xff soc_id 0xffffffff
> >>>>>>> [ 8.435991] ath11k_pci 0000:01:00.0: fw_version 0x270206d0
> >>>>>>> fw_build_timestamp 2022-08-04 12:48 fw_build_id
> >>>>>>> WLAN.HK.2.7.0.1-01744-QCAHKSWPL_SILICONZ-1
> >>>>>>> [ 8.441700] ath11k_pci 0000:01:00.0: Loaded FW:
> >>>>>>> ath11k/QCN9074/hw1.0/board-2.bin, sha256:
> >>>>>>> dbf0ca14aa1229eccd48f26f1026901b9718b143bd30b51b8ea67c84ba6207f1
> >>>>>>> [ 9.753764] ath11k_pci 0000:01:00.0: Loaded FW:
> >>>>>>> ath11k/QCN9074/hw1.0/m3.bin, sha256:
> >>>>>>> b6d957f335073a15a8de809398e1506f0200a08747eaf7189c843cf519ffc1de
> >>>>>>> [ 9.789791] ath11k_pci 0000:01:00.0: swiotlb buffer is full (sz:
> >>>>>>> 1048583 bytes), total 32768 (slots), used 2528 (slots)
> >>>>>>> [ 9.789853] ath11k_pci 0000:01:00.0: failed to set up tcl_comp ring (0) :-12
> >>>>>>> [ 9.790238] ath11k_pci 0000:01:00.0: failed to init DP: -12
> >>>>>>> root@noble-venice:~# cat /proc/cmdline
> >>>>>>> console=ttymxc1,115200 earlycon=ec_imx6q,0x30890000,115200
> >>>>>>> root=PARTUUID=5cdde84f-01 rootwait net.ifnames=0 cma=196M
> >>>>>>>
> >>>>>>> The IMX8MM's DRAM base is at 1GB so anything above 3GB hits the 32bit
> >>>>>>> address boundary. If I pass in a mem=3096M the device registers just
> >>>>>>> fine.
> >>>>>> yeah ... that parameter makes kernel alloc memory below 32bit boundary, thus swiotlb is not necessary.
> >>>>>
> >>>>> Hi Baochen,
> >>>>>
> >>>>> Yes, that makes sense as I step through the code. On IMX8M with DRAM
> >>>>> 3GB or less dma_capable(...) is true so swiotlb bounce buffers are not
> >>>>> needed.
> >>>>>
> >>>>>>
> >>>>>>>
> >>>>>>> I found this to be the case with modern kernels however I found
> >>>>>>> differing behavior with older kernels:
> >>>>>>> - 6.6 and 6.1 the device registers with 4GB DRAM but crashes on client connect
> >>>>>>> - 5.15 devices registers with 4GB DRAM and appears to work just fine
> >>>>>> are you using Linus' tree or the stable tree?
> >>>>>>
> >>>>>
> >>>>> For 6.6 I tested stable.
> >>>> can you try Linus's tree ? as I know the stable tree is possible to miss some important fix.
> >>>>
> >>>>>
> >>>>> This likely has something to do with commit dbd73acb22d8 ("wifi:
> >>>>> ath11k: enable 36 bit mask for stream DMA") but it would seem to me
> >>>>> that patch was trying to avoid the entire 32bit DMA limitation. Maybe
> >>>>> that patch sets the ath11k device DMA mask to 36 bits but maybe the
> >>>>> IMX8M PCI DMA is only capable of 32bits?
> >>>> that patch is making situation better, not worse. that said, it helps to avoid swiotlb in
> >>>> ath11k DMA, rather than to get it involved.
> >>>>
> >>>
> >>> Yes, that patch would be an improvement on systems capable of
> >>> addressing 64bit memory but not on the IMX8M which is seemingly
> >>> capable of only 32bit DMA over PCI.
> >>>
> >>>>>
> >>>>>>>
> >>>>>>> Could anyone explain what is going on here? Obviously there have been
> >>>>>>> changes at some point to start using swiotlb which I believe was all
> >>>>>>> about avoiding 32bit DMA limitations but I'm not clear how I should be
> >>>>>>> configuring this for IMX8MM with 4GB DRAM. Maybe my kernel IOMMU
> >>>>>>> configuration is incorrect somehow?
> >>>>>> there are quite some options associated with IOMMU, not sure which one might be causing this. But basically you may check:
> >>>>>>
> >>>>>> CONFIG_IOMMU_IOVA
> >>>>>> CONFIG_IOMMU_API
> >>>>>> CONFIG_IOMMU_SUPPORT
> >>>>>> CONFIG_IOMMU_DMA=y
> >>>>>>
> >>>>>
> >>>>> These are enabled which I believe appropriate for IMX8M. If I want to
> >>>>> utilize the full 4GB DRAM on IMX then I must use IOMMU and swiotlb
> >>>>> which would mean a performance hit due to copying mem to/from bounce
> >>>>> buffers not to mention the fact that I can't figure out how to
> >>>>> configure the system to avoid the 'swiotlb swiotlb buffer is full'
> >>>>> issue.
> >>>
> >>> My statement regarding needing an IOMMU above is wrong; apparently the
> >>> IMX8M SoC's don't have an IOMMU but the fact I have it enabled in the
> >>> kernel should be a don't-care. If I understand swiotlb correctly, if I
> >>> did have an IOMMU then it would be used instead of swiotlb.
> >>>
> >>>>>
> >>>>> Enabling CONFIG_SWIOTLB_DYNAMIC does not help nor does increasing the
> >>>>> number of slots - it has something to do with the number/size of DMA
> >>>>> buffers that ath11k is asking for:
> >>>> yeah, ath11k asks for fixed size DMA buffer regardless of that config.
> >>>>
> >>>>> # dmesg | grep swiotlb_tbl_map_single
> >>>>> [ 5.237731] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 16384 (slots=32768/ 32)
> >>>>> [ 5.247519] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 16416 (slots=32768/ 64)
> >>>>> [ 5.261794] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 16448 (slots=32768/ 96)
> >>>>> [ 5.275114] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 16480 (slots=32768/ 128)
> >>>>> [ 5.287757] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 16512 (slots=32768/ 160)
> >>>>> [ 5.299688] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 16544 (slots=32768/ 192)
> >>>>> [ 5.312482] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 16576 (slots=32768/ 224)
> >>>>> [ 5.324493] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 16608 (slots=32768/ 256)
> >>>>> [ 5.337001] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 16640 (slots=32768/ 288)
> >>>>> [ 5.346754] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 16672 (slots=32768/ 320)
> >>>>> [ 5.356571] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 16704 (slots=32768/ 352)
> >>>>> [ 5.366372] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 16736 (slots=32768/ 384)
> >>>>> [ 5.376164] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 16768 (slots=32768/ 416)
> >>>>> [ 5.385944] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 16800 (slots=32768/ 448)
> >>>>> [ 5.395712] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 16832 (slots=32768/ 480)
> >>>>> [ 5.408270] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 16864 (slots=32768/ 512)
> >>>>> [ 5.419768] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 16896 (slots=32768/ 544)
> >>>>> [ 5.430966] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 16928 (slots=32768/ 576)
> >>>>> [ 5.442368] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 16960 (slots=32768/ 608)
> >>>>> [ 5.452422] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 16992 (slots=32768/ 640)
> >>>>> [ 5.463507] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 17024 (slots=32768/ 672)
> >>>>> [ 5.473536] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 17056 (slots=32768/ 704)
> >>>>> [ 5.485661] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 17088 (slots=32768/ 736)
> >>>>> [ 5.495404] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 17120 (slots=32768/ 768)
> >>>>> [ 5.509626] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 17152 (slots=32768/ 800)
> >>>>> [ 5.519353] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 17184 (slots=32768/ 832)
> >>>>> [ 5.529077] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 17216 (slots=32768/ 864)
> >>>>> [ 5.538799] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 17248 (slots=32768/ 896)
> >>>>> [ 5.548517] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 17280 (slots=32768/ 928)
> >>>>> [ 5.558238] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 17312 (slots=32768/ 960)
> >>>>> [ 5.567965] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 17344 (slots=32768/ 992)
> >>>>> [ 5.578943] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 0 (slots=32768/ 992)
> >>>>> [ 5.578964] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 52B index= 8192 (slots=32768/ 993)
> >>>>> [ 5.599793] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 32 (slots=32768/ 992)
> >>>>> [ 5.599861] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 68B index= 8193 (slots=32768/ 993)
> >>>>> [ 5.609589] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 64 (slots=32768/ 993)
> >>>>> [ 5.628921] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 96 (slots=32768/ 992)
> >>>>> [ 5.638703] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 68B index= 17376 (slots=32768/ 993)
> >>>>> [ 5.649602] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 128 (slots=32768/ 992)
> >>>>> [ 5.659389] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 160 (slots=32768/ 992)
> >>>>> [ 5.674038] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 96B index= 17377 (slots=32768/ 993)
> >>>>> [ 5.685016] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 192 (slots=32768/ 992)
> >>>>> [ 5.694819] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 224 (slots=32768/ 992)
> >>>>> [ 5.694831] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 52B index= 17378 (slots=32768/ 993)
> >>>>> [ 5.714194] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 40B index= 17379 (slots=32768/ 994)
> >>>>> [ 5.725089] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 256 (slots=32768/ 992)
> >>>>> [ 5.753507] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 6224B index= 17380 (slots=32768/ 996)
> >>>>> [ 5.764668] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 288 (slots=32768/ 992)
> >>>>> [ 5.774456] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 320 (slots=32768/ 992)
> >>>>> [ 5.774620] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 6224B index= 17384 (slots=32768/ 996)
> >>>>> [ 5.795091] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 352 (slots=32768/ 992)
> >>>>> [ 5.795241] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 6224B index= 17388 (slots=32768/ 996)
> >>>>> [ 5.815724] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 384 (slots=32768/ 992)
> >>>>> [ 5.815884] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 6224B index= 17392 (slots=32768/ 996)
> >>>>> [ 5.836357] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 416 (slots=32768/ 992)
> >>>>> [ 5.836368] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 52B index= 8194 (slots=32768/ 993)
> >>>>> [ 5.855856] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 6224B index= 17396 (slots=32768/ 997)
> >>>>> [ 5.866818] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 448 (slots=32768/ 992)
> >>>>> [ 5.866978] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 6224B index= 17400 (slots=32768/ 996)
> >>>>> [ 5.887451] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 480 (slots=32768/ 992)
> >>>>> [ 5.897231] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 512 (slots=32768/ 992)
> >>>>> [ 5.897389] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 6224B index= 17404 (slots=32768/ 996)
> >>>>> [ 5.917866] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 544 (slots=32768/ 992)
> >>>>> [ 5.918026] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 6224B index= 17408 (slots=32768/ 996)
> >>>>> [ 5.938489] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 576 (slots=32768/ 992)
> >>>>> [ 5.938642] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 6224B index= 17412 (slots=32768/ 996)
> >>>>> [ 5.959121] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 608 (slots=32768/ 992)
> >>>>> [ 5.959135] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 52B index= 8195 (slots=32768/ 993)
> >>>>> [ 5.978619] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 6224B index= 17416 (slots=32768/ 997)
> >>>>> [ 5.989588] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 640 (slots=32768/ 992)
> >>>>> [ 5.989738] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 6224B index= 17420 (slots=32768/ 996)
> >>>>> [ 6.010215] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 672 (slots=32768/ 992)
> >>>>> [ 6.020001] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 704 (slots=32768/ 992)
> >>>>> [ 6.020158] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 6224B index= 17424 (slots=32768/ 996)
> >>>>> [ 6.040643] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 736 (slots=32768/ 992)
> >>>>> [ 6.040798] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 6224B index= 17428 (slots=32768/ 996)
> >>>>> [ 6.061287] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 768 (slots=32768/ 992)
> >>>>> [ 6.061437] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 6224B index= 17432 (slots=32768/ 996)
> >>>>> [ 6.081918] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 800 (slots=32768/ 992)
> >>>>> [ 6.081929] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 52B index= 8196 (slots=32768/ 993)
> >>>>> [ 6.101409] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 6224B index= 17436 (slots=32768/ 997)
> >>>>> [ 6.112375] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 832 (slots=32768/ 992)
> >>>>> [ 6.112528] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 6224B index= 17440 (slots=32768/ 996)
> >>>>> [ 6.133004] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 864 (slots=32768/ 992)
> >>>>> [ 6.142785] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 896 (slots=32768/ 992)
> >>>>> [ 6.142949] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 6224B index= 17444 (slots=32768/ 996)
> >>>>> [ 6.163426] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 928 (slots=32768/ 992)
> >>>>> [ 6.163576] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 6224B index= 17448 (slots=32768/ 996)
> >>>>> [ 6.184058] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 960 (slots=32768/ 992)
> >>>>> [ 6.184208] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 6224B index= 17452 (slots=32768/ 996)
> >>>>> [ 6.204691] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 992 (slots=32768/ 992)
> >>>>> [ 6.204704] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 52B index= 8197 (slots=32768/ 993)
> >>>>> [ 6.224183] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 6224B index= 17456 (slots=32768/ 997)
> >>>>> [ 6.235148] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 1024 (slots=32768/ 992)
> >>>>> [ 6.235308] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 6224B index= 17460 (slots=32768/ 996)
> >>>>> [ 6.255777] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 1056 (slots=32768/ 992)
> >>>>> [ 6.265552] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 1088 (slots=32768/ 992)
> >>>>> [ 6.265633] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 2128B index= 17464 (slots=32768/ 994)
> >>>>> [ 6.286142] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 1120 (slots=32768/ 992)
> >>>>> [ 6.286182] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 72B index= 17466 (slots=32768/ 993)
> >>>>> [ 7.574489] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 1152 (slots=32768/ 992)
> >>>>> [ 7.584645] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 60B index= 17467 (slots=32768/ 993)
> >>>>> [ 7.595593] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 1184 (slots=32768/ 992)
> >>>>> [ 7.595608] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 52B index= 8198 (slots=32768/ 993)
> >>>>> [ 7.605359] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 1216 (slots=32768/ 993)
> >>>>> [ 7.624703] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 452B index= 1248 (slots=32768/ 993)
> >>>>> [ 7.635603] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 1280 (slots=32768/ 992)
> >>>>> [ 7.645344] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 52B index= 1312 (slots=32768/ 993)
> >>>>> [ 7.656247] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 1314 (slots=32768/ 992)
> >>>>> [ 7.683567] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
> >>>>> 65535B index= 1346 (slots=32768/ 992)
> >>>>> [ 7.696095] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single
> >>>>> size=1048583B index= -1 (slots=32768/ 992)
> >>>>>
> >>>>> I'm still trying to understand the swiotlb allocation to see if there
> >>>>> is some configuration change I should be making.
> >>>>
> >>>> I suspect you hit the same issue mentioned here:
> >>>>
> >>>> https://lore.kernel.org/all/CAOMZO5A7+nxACoBPY0k8cOpVQByZtEV_N1489MK5wETHF_RXWA@mail.gmail.com/
> >>>>
> >>>> so can you check if below commit present in your kernel, and if not could you pick it up
> >>>> and try again?
> >>>>
> >>>> commit 14cebf689a78 ("swiotlb: Reinstate page-alignment for mappings >= PAGE_SIZE")
> >> ignore this request since it should not be related to your issue :(
> >>
> >>>>
> >>>
> >>> I bisected the 'swiotlb buffer is full' issue back to commit
> >>> aaf244141ed7 ("wifi: ath11k: fix IOMMU errors on buffer rings") which
> >>> looks to me to be a legitimate fix and if I revert it swiotlb is now
> >>> happy and the driver registers but I get the crash on client connect
> >>> that I was seeing in 6.6 so that commit fixes an issue, but causes
> >>> swiotlb to not be fulfilled.
> >> not really ... that commit is not the cause of your issue. you don't see the 'swiotlb
> >> full' error after reverting it simply because dma_map_single() is NOT called then.
> >>
> >>
> >>>
> >>> The issue seems to be that the swiotlb memory buffer allocator is
> >>> getting too fragmented to be useful with what ath11k is now asking for
> >>> (a lot of 2K and 64K buffers and then finally a 1048583B buffer which
> >>> fails due to the fragmentation of the swiotlb buffer).
> >> no, the direct cause of the 'swiotlb full' error is that the kernel does not allow a swiotlb map
> >> request larger than 256 KiB [1]:
> >>
> >> 'A single allocation from swiotlb is limited to IO_TLB_SIZE * IO_TLB_SEGSIZE bytes, which
> >> is 256 KiB with current definitions'
> >>
> >> while here ath11k is requesting a buffer of 1048583 bytes.
> >>
> >>
> >> however the question is why swiotlb is involved here: for streaming DMA operations
> >> ath11k is capable of addressing 64GB memory (with 36bit DMA mask), in your case this
> >> covers whole system memory. the most likely reason I can think of is that swiotlb is
> >> forcibly enabled in your kernel (with swiotlb=force?) such that each DMA buffer would be
> >> bounced by swiotlb regardless of its physical address.
> >>
> >
> > I do not have swiotlb forced explicitly. Again, this is because I'm on
> > an IMX8MM with 4GiB DRAM which has no IOMMU and 32bit DMA, where
> > peripherals can not access memory over 3GiB as its DRAM base starts
> > at 1GiB (so swiotlb is getting used with a DRAM size >3GiB).
> ah ... I get your point and agree. so the limitation doesn't come from the ath11k
> hardware, but comes from IMX8MM itself. I guess the direct cause for involving swiotlb is
> that dma_capable() returns false because dev->bus_dma_limit is ((1ULL << 32) - 1).
>
> >
> > Reverting commit d0e2523bfa9c ("ath11k: allocate HAL_WBM2SW_RELEASE
> > ring from cacheable memory") indeed resolves this issue.
> correct. by reverting it ath11k uses dma_alloc_coherent() instead of dma_map_single(), so
> the issue is gone.
>
> >
> > I notice that ath12k has a similar architecture to ath11k where
> > ath12k_dp_srng_setup() looks like what ath11k_dp_srng_setup() did before
> > the change to allocate its buffers from cacheable memory so it's
> > probably just a matter of time before the same changes are made to
> > ath12k which will break that for this platform/memory-size as well.
> thanks, will take care.
>
> >
> > So the ways I see to resolve this are either:
> > a) revert commit d0e2523bfa9c ("ath11k: allocate HAL_WBM2SW_RELEASE
> > ring from cacheable memory") - to stop asking for buffers >256KiB
> > b) find some other use of that upper 1GiB so that it can't be
> > allocated by DMA and swiotlb isn't needed
> > c) tell my board users to use mem=3096M and lose that last 1GiB of DRAM
>
> while the first one seems best it impacts performance. so I have another proposal: in case
> an IOMMU is not present, check the DMA addressing limitation before allocating the buffer. If it can
> not cover the 36 bit memory space and the system is able to alloc buffers above 4GB, pass
> GFP_DMA32 or GFP_DMA to kzalloc() such that we can get a buffer below 4GB/16MB.
>
> anyway, can you send a patch for that?
>
I could work up a patch if I understood the memory allocation better.
Do you know how to check for this situation?
Are you saying to allow the current kzalloc and then check the address
given to see if it's DMA-able (how?), then free it, realloc it with
GFP_DMA32, and skip the dma_map_single?
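
For reference, the "is it DMA-able" check can be modelled in userspace.
The sketch below is only an illustration, assuming the comparison that
dma_capable() is understood to make (end of the buffer against the
smaller non-zero value of the device DMA mask and bus_dma_limit). It is
Python rather than kernel code, and the helper names and example buffer
addresses are hypothetical.

```python
# Userspace model only, not kernel code. Names and addresses are hypothetical.

GIB = 1 << 30


def min_not_zero(a, b):
    """Smaller of two limits, treating 0 as 'no limit set'."""
    if a == 0:
        return b
    if b == 0:
        return a
    return min(a, b)


def dma_capable_model(addr, size, dma_mask, bus_dma_limit):
    """True if the whole buffer [addr, addr + size) is reachable by the device."""
    end = addr + size - 1
    return end <= min_not_zero(dma_mask, bus_dma_limit)


# Values taken from the thread: 36-bit ath11k mask, 32-bit bus limit,
# and the 1048583-byte ring that fails to map.
ATH11K_MASK = (1 << 36) - 1
BUS_LIMIT = (1 << 32) - 1
RING_SIZE = 1048583

low_buf = 2 * GIB    # hypothetical buffer below the 4 GiB boundary
high_buf = 4 * GIB   # hypothetical buffer in the top 1 GiB of DRAM

print(dma_capable_model(low_buf, RING_SIZE, ATH11K_MASK, BUS_LIMIT))   # True
print(dma_capable_model(high_buf, RING_SIZE, ATH11K_MASK, BUS_LIMIT))  # False
```

A buffer for which this model returns False is the case that ends up
bounced through swiotlb. Whether a driver can cleanly make an
equivalent check before mapping is the open question.
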
I've added the iommu folk to the thread to see if they have any input.

To recap, the issue here is that ath11k wants to allocate some large
(~1MiB) cacheable buffers, and on an IOMMU-less system (IMX8M) with a
32bit DMA engine this fails: mapping them requires swiotlb, and because
the buffer size exceeds what swiotlb can map in one request, the result
is a 'swiotlb buffer is full' error.
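
The size limit behind this can be worked through with a little
arithmetic. The following is a sketch, assuming the slot size and
segment size defined in include/linux/swiotlb.h (2 KiB slots, at most
128 contiguous slots per mapping), which give the 256 KiB figure quoted
from Documentation/core-api/swiotlb.rst. The helper names are made up
for illustration, and alignment padding is ignored.

```python
# Back-of-the-envelope model of swiotlb sizing; helper names are hypothetical.

IO_TLB_SHIFT = 11
IO_TLB_SIZE = 1 << IO_TLB_SHIFT      # bytes per swiotlb slot (2048)
IO_TLB_SEGSIZE = 128                 # max contiguous slots for one mapping

MAX_SWIOTLB_MAPPING = IO_TLB_SIZE * IO_TLB_SEGSIZE   # 262144 bytes = 256 KiB


def slots_needed(size):
    """Number of slots a mapping of 'size' bytes occupies (rounded up)."""
    return -(-size // IO_TLB_SIZE)


def fits_in_swiotlb(size):
    """True if a single mapping of 'size' bytes is within the swiotlb limit."""
    return size <= MAX_SWIOTLB_MAPPING


# Sizes taken from the dmesg output in this thread.
for size in (52, 6224, 65535, 1048583):
    print(size, slots_needed(size), fits_in_swiotlb(size))
# 52 1 True
# 6224 4 True
# 65535 32 True
# 1048583 513 False
```

The slot counts line up with the debug output above: each 65535B
mapping advances the used count by 32 and each 6224B mapping by 4,
while the 1048583B request would need 513 contiguous slots, well over
the 128 a single mapping may use, regardless of how many slots are free.
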
Best Regards,
Tim
> >
> >>
> >>
> >> [1] Documentation/core-api/swiotlb.rst
> >>
> >>>
> >>> I'm guessing that this has gone unnoticed for a while because there
> >>> are maybe not a lot of systems out there that require swiotlb with
> >>> ath11k (either no IOMMU or more memory than DMA can address) and my
> >>> guess is that if you test ath11k with swiotlb=force you will easily
> >>> see this 'swiotlb buffer is full' issue on other systems.
> >>>
> >>> I'm not that knowledgeable about ath11k but I do know that ath10 and
> >>> ath12k do not have this issue with swiotlb. Debugging a bit shows that
> >>> there are a lot of large DMA buffers being requested by ath11k and I'm
> >>> wondering if that could be reduced or optimized somehow.
> >>>
> >>>>
> >>>>>
> >>>>> To avoid using swiotlb is there some way to limit the memory region
> >>>>> used for DMA operations to below 32bit boundary yet still allow the
> >>>>> memory above 32bit to be useful in the system for userspace maybe?
> >>>> if you are using dma_alloc_coherent() I'm afraid there is no way for that. the API
> >>>> internally ignores any zone flags passed with the 'gfp' argument. see
> >>>>
> >>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/dma/mapping.c#n615
> >>>>
> >>>
> >>> is DMA_RESTRICTED_POOL a solution for me?
> >> I don't think this helps since this is used in coherent DMA?
> >>
> >
> > While DMA_RESTRICTED_POOL does allow defining the area used by swiotlb
> > it doesn't change the way swiotlb allocates buffers or the fact that
> > swiotlb is used at all.
> >
> > Best Regards,
> >
> > Tim
> >
> >
> >>>
> >>> Best Regards,
> >>>
> >>> Tim
> >>>
> >>
>
* Re: ath11k swiotlb buffer is full (on IMX8M with 4GiB DRAM)
2024-12-06 17:07 ` Tim Harvey
@ 2024-12-09 6:39 ` Baochen Qiang
2024-12-09 8:17 ` Christoph Hellwig
1 sibling, 0 replies; 17+ messages in thread
From: Baochen Qiang @ 2024-12-09 6:39 UTC (permalink / raw)
To: Tim Harvey
Cc: ath11k, linux-wireless, Fabio Estevam, Christoph Hellwig,
Marek Szyprowski, Robin Murphy, iommu
On 12/7/2024 1:07 AM, Tim Harvey wrote:
> On Mon, Nov 25, 2024 at 6:47 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote:
>>
>>
>>
>> On 11/26/2024 2:02 AM, Tim Harvey wrote:
>>> On Sun, Nov 24, 2024 at 11:23 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote:
>>>>
>>>>
>>>>
>>>> On 11/23/2024 8:43 AM, Tim Harvey wrote:
>>>>> On Thu, Nov 21, 2024 at 9:51 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 11/22/2024 5:50 AM, Tim Harvey wrote:
>>>>>>> On Tue, Nov 19, 2024 at 6:32 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 11/20/2024 4:16 AM, Tim Harvey wrote:
>>>>>>>>> Greetings,
>>>>>>>>>
>>>>>>>>> I've got an ath11k card that is failing to init on an IMX8MM system
>>>>>>>>> with 4GB of DRAM:
>>>>>>>>> [ 7.551582] ath11k_pci 0000:01:00.0: BAR 0 [mem
>>>>>>>>> 0x18000000-0x181fffff 64bit]: assigned
>>>>>>>>> [ 7.551713] ath11k_pci 0000:01:00.0: enabling device (0000 -> 0002)
>>>>>>>>> [ 7.552401] ath11k_pci 0000:01:00.0: MSI vectors: 16
>>>>>>>>> [ 7.552440] ath11k_pci 0000:01:00.0: qcn9074 hw1.0
>>>>>>>>> [ 7.887186] mhi mhi0: Loaded FW: ath11k/QCN9074/hw1.0/amss.bin,
>>>>>>>>> sha256: 5ee1b7b204541b5f99984f21d694ececaec08fbce1b520ffe6fe740b02a4afd7
>>>>>>>>> [ 8.435964] ath11k_pci 0000:01:00.0: chip_id 0x0 chip_family 0x0
>>>>>>>>> board_id 0xff soc_id 0xffffffff
>>>>>>>>> [ 8.435991] ath11k_pci 0000:01:00.0: fw_version 0x270206d0
>>>>>>>>> fw_build_timestamp 2022-08-04 12:48 fw_build_id
>>>>>>>>> WLAN.HK.2.7.0.1-01744-QCAHKSWPL_SILICONZ-1
>>>>>>>>> [ 8.441700] ath11k_pci 0000:01:00.0: Loaded FW:
>>>>>>>>> ath11k/QCN9074/hw1.0/board-2.bin, sha256:
>>>>>>>>> dbf0ca14aa1229eccd48f26f1026901b9718b143bd30b51b8ea67c84ba6207f1
>>>>>>>>> [ 9.753764] ath11k_pci 0000:01:00.0: Loaded FW:
>>>>>>>>> ath11k/QCN9074/hw1.0/m3.bin, sha256:
>>>>>>>>> b6d957f335073a15a8de809398e1506f0200a08747eaf7189c843cf519ffc1de
>>>>>>>>> [ 9.789791] ath11k_pci 0000:01:00.0: swiotlb buffer is full (sz:
>>>>>>>>> 1048583 bytes), total 32768 (slots), used 2528 (slots)
>>>>>>>>> [ 9.789853] ath11k_pci 0000:01:00.0: failed to set up tcl_comp ring (0) :-12
>>>>>>>>> [ 9.790238] ath11k_pci 0000:01:00.0: failed to init DP: -12
>>>>>>>>> root@noble-venice:~# cat /proc/cmdline
>>>>>>>>> console=ttymxc1,115200 earlycon=ec_imx6q,0x30890000,115200
>>>>>>>>> root=PARTUUID=5cdde84f-01 rootwait net.ifnames=0 cma=196M
>>>>>>>>>
>>>>>>>>> The IMX8MM's DRAM base is at 1GB so anything above 3GB hits the 32bit
>>>>>>>>> address boundary. If I pass in a mem=3096M the device registers just
>>>>>>>>> fine.
>>>>>>>> yeah ... that parameter makes kernel alloc memory below 32bit boundary, thus swiotlb is not necessary.
>>>>>>>
>>>>>>> Hi Baochen,
>>>>>>>
>>>>>>> Yes, that makes sense as I step through the code. On IMX8M with DRAM
>>>>>>> 3GB or less dma_capable(...) is true so swiotlb bounce buffers are not
>>>>>>> needed.
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> I found this to be the case with modern kernels however I found
>>>>>>>>> differing behavior with older kernels:
>>>>>>>>> - 6.6 and 6.1 the device registers with 4GB DRAM but crashes on client connect
>>>>>>>>> - 5.15 device registers with 4GB DRAM and appears to work just fine
>>>>>>>> are you using Linus' tree or the stable tree?
>>>>>>>>
>>>>>>>
>>>>>>> For 6.6 I tested stable.
>>>>>> can you try Linus's tree? as far as I know the stable tree may be missing some important fixes.
>>>>>>
>>>>>>>
>>>>>>> This likely has something to do with commit dbd73acb22d8 ("wifi:
>>>>>>> ath11k: enable 36 bit mask for stream DMA") but it would seem to me
>>>>>>> that patch was trying to avoid the entire 32bit DMA limitation. Maybe
>>>>>>> that patch sets the ath11k device DMA mask to 36 bits but maybe the
>>>>>>> IMX8M PCI DMA is only capable of 32bits?
>>>>>> that patch is making the situation better, not worse. that is to say, it helps to avoid swiotlb in
>>>>>> ath11k DMA, rather than to get it involved.
>>>>>>
>>>>>
>>>>> Yes, that patch would be an improvement on systems capable of
>>>>> addressing 64bit memory but not on the IMX8M which is seemingly
>>>>> capable of only 32bit DMA over PCI.
>>>>>
>>>>>>>
>>>>>>>>>
>>>>>>>>> Could anyone explain what is going on here? Obviously there have been
>>>>>>>>> changes at some point to start using swiotlb which I believe was all
>>>>>>>>> about avoiding 32bit DMA limitations but I'm not clear how I should be
>>>>>>>>> configuring this for IMX8MM with 4GB DRAM. Maybe my kernel IOMMU
>>>>>>>>> configuration is incorrect somehow?
>>>>>>>> there are quite some options associated with IOMMU, not sure which one might be causing this. But basically you may check:
>>>>>>>>
>>>>>>>> CONFIG_IOMMU_IOVA
>>>>>>>> CONFIG_IOMMU_API
>>>>>>>> CONFIG_IOMMU_SUPPORT
>>>>>>>> CONFIG_IOMMU_DMA=y
>>>>>>>>
>>>>>>>
>>>>>>> These are enabled which I believe appropriate for IMX8M. If I want to
>>>>>>> utilize the full 4GB DRAM on IMX then I must use IOMMU and swiotlb
>>>>>>> which would mean a performance hit due to copying mem to/from bounce
>>>>>>> buffers not to mention the fact that I can't figure out how to
>>>>>>> configure the system to avoid the 'swiotlb buffer is full'
>>>>>>> issue.
>>>>>
>>>>> My statement regarding needing an IOMMU above is wrong; apparently the
>>>>> IMX8M SoC's don't have an IOMMU but the fact I have it enabled in the
>>>>> kernel should be a don't-care. If I understand swiotlb correctly, if I
>>>>> did have an IOMMU then it would be used instead of swiotlb.
>>>>>
>>>>>>>
>>>>>>> Enabling CONFIG_SWIOTLB_DYNAMIC does not help nor does increasing the
>>>>>>> number of slots - it has something to do with the number/size of DMA
>>>>>>> buffers that ath11k is asking for:
>>>>>> yeah, ath11k asks for fixed size DMA buffer regardless of that config.
>>>>>>
>>>>>>> # dmesg | grep swiotlb_tbl_map_single
>>>>>>> [ 5.237731] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 16384 (slots=32768/ 32)
>>>>>>> [ 5.247519] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 16416 (slots=32768/ 64)
>>>>>>> [ 5.261794] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 16448 (slots=32768/ 96)
>>>>>>> [ 5.275114] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 16480 (slots=32768/ 128)
>>>>>>> [ 5.287757] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 16512 (slots=32768/ 160)
>>>>>>> [ 5.299688] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 16544 (slots=32768/ 192)
>>>>>>> [ 5.312482] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 16576 (slots=32768/ 224)
>>>>>>> [ 5.324493] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 16608 (slots=32768/ 256)
>>>>>>> [ 5.337001] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 16640 (slots=32768/ 288)
>>>>>>> [ 5.346754] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 16672 (slots=32768/ 320)
>>>>>>> [ 5.356571] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 16704 (slots=32768/ 352)
>>>>>>> [ 5.366372] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 16736 (slots=32768/ 384)
>>>>>>> [ 5.376164] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 16768 (slots=32768/ 416)
>>>>>>> [ 5.385944] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 16800 (slots=32768/ 448)
>>>>>>> [ 5.395712] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 16832 (slots=32768/ 480)
>>>>>>> [ 5.408270] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 16864 (slots=32768/ 512)
>>>>>>> [ 5.419768] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 16896 (slots=32768/ 544)
>>>>>>> [ 5.430966] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 16928 (slots=32768/ 576)
>>>>>>> [ 5.442368] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 16960 (slots=32768/ 608)
>>>>>>> [ 5.452422] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 16992 (slots=32768/ 640)
>>>>>>> [ 5.463507] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 17024 (slots=32768/ 672)
>>>>>>> [ 5.473536] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 17056 (slots=32768/ 704)
>>>>>>> [ 5.485661] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 17088 (slots=32768/ 736)
>>>>>>> [ 5.495404] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 17120 (slots=32768/ 768)
>>>>>>> [ 5.509626] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 17152 (slots=32768/ 800)
>>>>>>> [ 5.519353] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 17184 (slots=32768/ 832)
>>>>>>> [ 5.529077] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 17216 (slots=32768/ 864)
>>>>>>> [ 5.538799] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 17248 (slots=32768/ 896)
>>>>>>> [ 5.548517] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 17280 (slots=32768/ 928)
>>>>>>> [ 5.558238] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 17312 (slots=32768/ 960)
>>>>>>> [ 5.567965] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 17344 (slots=32768/ 992)
>>>>>>> [ 5.578943] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 0 (slots=32768/ 992)
>>>>>>> [ 5.578964] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 52B index= 8192 (slots=32768/ 993)
>>>>>>> [ 5.599793] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 32 (slots=32768/ 992)
>>>>>>> [ 5.599861] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 68B index= 8193 (slots=32768/ 993)
>>>>>>> [ 5.609589] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 64 (slots=32768/ 993)
>>>>>>> [ 5.628921] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 96 (slots=32768/ 992)
>>>>>>> [ 5.638703] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 68B index= 17376 (slots=32768/ 993)
>>>>>>> [ 5.649602] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 128 (slots=32768/ 992)
>>>>>>> [ 5.659389] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 160 (slots=32768/ 992)
>>>>>>> [ 5.674038] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 96B index= 17377 (slots=32768/ 993)
>>>>>>> [ 5.685016] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 192 (slots=32768/ 992)
>>>>>>> [ 5.694819] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 224 (slots=32768/ 992)
>>>>>>> [ 5.694831] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 52B index= 17378 (slots=32768/ 993)
>>>>>>> [ 5.714194] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 40B index= 17379 (slots=32768/ 994)
>>>>>>> [ 5.725089] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 256 (slots=32768/ 992)
>>>>>>> [ 5.753507] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 6224B index= 17380 (slots=32768/ 996)
>>>>>>> [ 5.764668] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 288 (slots=32768/ 992)
>>>>>>> [ 5.774456] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 320 (slots=32768/ 992)
>>>>>>> [ 5.774620] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 6224B index= 17384 (slots=32768/ 996)
>>>>>>> [ 5.795091] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 352 (slots=32768/ 992)
>>>>>>> [ 5.795241] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 6224B index= 17388 (slots=32768/ 996)
>>>>>>> [ 5.815724] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 384 (slots=32768/ 992)
>>>>>>> [ 5.815884] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 6224B index= 17392 (slots=32768/ 996)
>>>>>>> [ 5.836357] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 416 (slots=32768/ 992)
>>>>>>> [ 5.836368] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 52B index= 8194 (slots=32768/ 993)
>>>>>>> [ 5.855856] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 6224B index= 17396 (slots=32768/ 997)
>>>>>>> [ 5.866818] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 448 (slots=32768/ 992)
>>>>>>> [ 5.866978] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 6224B index= 17400 (slots=32768/ 996)
>>>>>>> [ 5.887451] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 480 (slots=32768/ 992)
>>>>>>> [ 5.897231] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 512 (slots=32768/ 992)
>>>>>>> [ 5.897389] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 6224B index= 17404 (slots=32768/ 996)
>>>>>>> [ 5.917866] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 544 (slots=32768/ 992)
>>>>>>> [ 5.918026] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 6224B index= 17408 (slots=32768/ 996)
>>>>>>> [ 5.938489] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 576 (slots=32768/ 992)
>>>>>>> [ 5.938642] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 6224B index= 17412 (slots=32768/ 996)
>>>>>>> [ 5.959121] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 608 (slots=32768/ 992)
>>>>>>> [ 5.959135] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 52B index= 8195 (slots=32768/ 993)
>>>>>>> [ 5.978619] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 6224B index= 17416 (slots=32768/ 997)
>>>>>>> [ 5.989588] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 640 (slots=32768/ 992)
>>>>>>> [ 5.989738] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 6224B index= 17420 (slots=32768/ 996)
>>>>>>> [ 6.010215] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 672 (slots=32768/ 992)
>>>>>>> [ 6.020001] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 704 (slots=32768/ 992)
>>>>>>> [ 6.020158] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 6224B index= 17424 (slots=32768/ 996)
>>>>>>> [ 6.040643] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 736 (slots=32768/ 992)
>>>>>>> [ 6.040798] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 6224B index= 17428 (slots=32768/ 996)
>>>>>>> [ 6.061287] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 768 (slots=32768/ 992)
>>>>>>> [ 6.061437] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 6224B index= 17432 (slots=32768/ 996)
>>>>>>> [ 6.081918] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 800 (slots=32768/ 992)
>>>>>>> [ 6.081929] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 52B index= 8196 (slots=32768/ 993)
>>>>>>> [ 6.101409] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 6224B index= 17436 (slots=32768/ 997)
>>>>>>> [ 6.112375] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 832 (slots=32768/ 992)
>>>>>>> [ 6.112528] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 6224B index= 17440 (slots=32768/ 996)
>>>>>>> [ 6.133004] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 864 (slots=32768/ 992)
>>>>>>> [ 6.142785] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 896 (slots=32768/ 992)
>>>>>>> [ 6.142949] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 6224B index= 17444 (slots=32768/ 996)
>>>>>>> [ 6.163426] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 928 (slots=32768/ 992)
>>>>>>> [ 6.163576] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 6224B index= 17448 (slots=32768/ 996)
>>>>>>> [ 6.184058] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 960 (slots=32768/ 992)
>>>>>>> [ 6.184208] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 6224B index= 17452 (slots=32768/ 996)
>>>>>>> [ 6.204691] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 992 (slots=32768/ 992)
>>>>>>> [ 6.204704] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 52B index= 8197 (slots=32768/ 993)
>>>>>>> [ 6.224183] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 6224B index= 17456 (slots=32768/ 997)
>>>>>>> [ 6.235148] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 1024 (slots=32768/ 992)
>>>>>>> [ 6.235308] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 6224B index= 17460 (slots=32768/ 996)
>>>>>>> [ 6.255777] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 1056 (slots=32768/ 992)
>>>>>>> [ 6.265552] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 1088 (slots=32768/ 992)
>>>>>>> [ 6.265633] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 2128B index= 17464 (slots=32768/ 994)
>>>>>>> [ 6.286142] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 1120 (slots=32768/ 992)
>>>>>>> [ 6.286182] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 72B index= 17466 (slots=32768/ 993)
>>>>>>> [ 7.574489] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 1152 (slots=32768/ 992)
>>>>>>> [ 7.584645] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 60B index= 17467 (slots=32768/ 993)
>>>>>>> [ 7.595593] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 1184 (slots=32768/ 992)
>>>>>>> [ 7.595608] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 52B index= 8198 (slots=32768/ 993)
>>>>>>> [ 7.605359] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 1216 (slots=32768/ 993)
>>>>>>> [ 7.624703] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 452B index= 1248 (slots=32768/ 993)
>>>>>>> [ 7.635603] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 1280 (slots=32768/ 992)
>>>>>>> [ 7.645344] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 52B index= 1312 (slots=32768/ 993)
>>>>>>> [ 7.656247] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 1314 (slots=32768/ 992)
>>>>>>> [ 7.683567] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single size=
>>>>>>> 65535B index= 1346 (slots=32768/ 992)
>>>>>>> [ 7.696095] ath11k_pci 0000:01:00.0: swiotlb_tbl_map_single
>>>>>>> size=1048583B index= -1 (slots=32768/ 992)
>>>>>>>
>>>>>>> I'm still trying to understand the swiotlb allocation to see if there
>>>>>>> is some configuration change I should be making.
>>>>>>
>>>>>> I suspect you hit the same issue mentioned here:
>>>>>>
>>>>>> https://lore.kernel.org/all/CAOMZO5A7+nxACoBPY0k8cOpVQByZtEV_N1489MK5wETHF_RXWA@mail.gmail.com/
>>>>>>
>>>>>> so can you check whether the commit below is present in your kernel and, if not, pick it up
>>>>>> and try again?
>>>>>>
>>>>>> commit 14cebf689a78 ("swiotlb: Reinstate page-alignment for mappings >= PAGE_SIZE")
>>>> ignore this request since it should not be related to your issue :(
>>>>
>>>>>>
>>>>>
>>>>> I bisected the 'swiotlb buffer is full' issue back to commit
>>>>> aaf244141ed7 ("wifi: ath11k: fix IOMMU errors on buffer rings") which
>>>>> looks to me to be a legitimate fix and if I revert it swiotlb is now
>>>>> happy and the driver registers but I get the crash on client connect
>>>>> that I was seeing in 6.6 so that commit fixes an issue, but causes
>>>>> swiotlb to not be fulfilled.
>>>> not really ... that commit is not the cause of your issue. you don't see the 'swiotlb
>>>> full' error after reverting it simply because dma_map_single() is NOT called then.
>>>>
>>>>
>>>>>
>>>>> The issue seems to be that the swiotlb memory buffer allocator is
>>>>> getting too fragmented to be useful with what ath11k is now asking for
>>>>> (a lot of 2K and 64K buffers and then finally a 1048583B buffer which
>>>>> fails due to the fragmentation of the swiotlb buffer).
>>>> no, the direct cause of the 'swiotlb full' error is that the kernel does not allow a swiotlb map
>>>> request larger than 256 KiB [1]:
>>>>
>>>> 'A single allocation from swiotlb is limited to IO_TLB_SIZE * IO_TLB_SEGSIZE bytes, which
>>>> is 256 KiB with current definitions'
>>>>
>>>> while here ath11k is requesting a buffer of 1048583 bytes.
>>>>
>>>>
>>>> however the question is why swiotlb is involved here: for streaming DMA operations
>>>> ath11k is capable of addressing 64GB of memory (with a 36bit DMA mask), which in your case
>>>> covers the whole system memory. the most likely reason I can think of is that swiotlb is
>>>> forcibly enabled in your kernel (with swiotlb=force?) such that each DMA buffer would be
>>>> bounced by swiotlb regardless of its physical address.
>>>>
>>>
>>> I do not have swiotlb forced explicitly. Again, this is because I'm on
>>> an IMX8MM with 4GiB DRAM which has no IOMMU and 32bit DMA, where
>>> peripherals cannot access memory over 3GiB as its DRAM base starts
>>> at 1GiB (so swiotlb is getting used with a DRAM size >3GiB).
>> ah ... I get your point and agree. so the limitation doesn't come from the ath11k
>> hardware, but comes from the IMX8MM itself. I guess the direct cause for involving swiotlb is
>> that dma_capable() returns false because dev->bus_dma_limit is ((1ULL << 32) - 1).
>>
>>>
>>> Reverting commit d0e2523bfa9c ("ath11k: allocate HAL_WBM2SW_RELEASE
>>> ring from cacheable memory") indeed resolves this issue.
>> correct. by reverting it ath11k uses dma_alloc_coherent() instead of dma_map_single(), so
>> the issue is gone.
>>
>>>
>>> I notice that ath12k has a similar architecture as ath11k where
>>> ath12k_dp_srng_setup() looks like what ath11k_dp_srng_setup() did before
>>> the change to allocate its buffers from cacheable memory so it's
>>> probably just a matter of time before the same changes are made to
>>> ath12k which will break that for this platform/memory-size as well.
>> thanks, will take care.
>>
>>>
>>> So the ways I see to resolve this are either:
>>> a) revert commit d0e2523bfa9c ("ath11k: allocate HAL_WBM2SW_RELEASE
>>> ring from cacheable memory") - to stop asking for buffers >256KiB
>>> b) find some other use of that upper 1GiB so that it can't be
>>> allocated by DMA and swiotlb isn't needed
>>> c) tell my board users to use mem=3096M and lose that last 1GiB of DRAM
>>
>> while the first one seems best it impacts performance. so I have another proposal: in case
>> an IOMMU is not present, check the DMA addressing limitation before allocating the buffer. If it
>> cannot cover the 36 bit memory space and the system is able to alloc buffers above 4GB, pass
>> GFP_DMA32 or GFP_DMA to kzalloc() such that we can get a buffer below 4GB/16MB.
>>
>> anyway, can you send a patch for that?
>>
>
> I could work up a patch if I understood the memory allocation better.
> Do you know how to check for this situation?
>
> Are you saying allow the current kzalloc and then check the address
> given to see if it's dma-able (how?) then free it and realloc it with
> GFP_DMA32 and skip the dma_map_single?
yeah, something like that. But thinking about it more I realize that it is not a general
solution, even if it can fix the issue in your case.
For a general solution, what I can think of now is either of the below:
1. revert commit d0e2523bfa9c ("ath11k: allocate HAL_WBM2SW_RELEASE ring from cacheable
memory"). This way we can avoid the size limitation on streaming DMA.
2. limit the allocation size to be less than 256KB.
However both of them may have an impact on performance.
>
> I've added the iommu folk to the thread to see if they have any input.
> To recap, the issue here is that ath11k wants to allocate some large
> (~1MiB) cacheable buffers and on an iommu-less system (IMX8M) with a
> 32bit DMA engine this will fail as it requires swiotlb and the buffer
> size being too large results in a swiotlb buffer full error.
>
> Best Regards,
>
> Tim
>
>>>
>>>>
>>>>
>>>> [1] Documentation/core-api/swiotlb.rst
>>>>
>>>>>
>>>>> I'm guessing that this has gone unnoticed for a while because there
>>>>> are maybe not a lot of systems out there that require swiotlb with
>>>>> ath11k (either no IOMMU or more memory than DMA can address) and my
>>>>> guess is that if you test ath11k with swiotlb=force you will easily
>>>>> see this 'swiotlb buffer is full' issue on other systems.
>>>>>
>>>>> I'm not that knowledgeable about ath11k but I do know that ath10k and
>>>>> ath12k do not have this issue with swiotlb. Debugging a bit shows that
>>>>> there are a lot of large DMA buffers being requested by ath11k and I'm
>>>>> wondering if that could be reduced or optimized somehow.
>>>>>
>>>>>>
>>>>>>>
>>>>>>> To avoid using swiotlb is there some way to limit the memory region
>>>>>>> used for DMA operations to below 32bit boundary yet still allow the
>>>>>>> memory above 32bit to be useful in the system for userspace maybe?
>>>>>> if you are using dma_alloc_coherent() I'm afraid there is no way for that. the API
>>>>>> internally ignores any zone flags passed with the 'gfp' argument. see
>>>>>>
>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/dma/mapping.c#n615
>>>>>>
>>>>>
>>>>> is DMA_RESTRICTED_POOL a solution for me?
>>>> i don't think this helps since this is used in coherent DMA?
>>>>
>>>
>>> While DMA_RESTRICTED_POOL does allow defining the area used by swiotlb
>>> it doesn't change the way swiotlb allocates buffers or the fact that
>>> swiotlb is used at all.
>>>
>>> Best Regards,
>>>
>>> Tim
>>>
>>>
>>>>>
>>>>> Best Regards,
>>>>>
>>>>> Tim
>>>>>
>>>>
>>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: ath11k swiotlb buffer is full (on IMX8M with 4GiB DRAM)
2024-12-06 17:07 ` Tim Harvey
2024-12-09 6:39 ` Baochen Qiang
@ 2024-12-09 8:17 ` Christoph Hellwig
2024-12-09 10:49 ` Robin Murphy
1 sibling, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2024-12-09 8:17 UTC (permalink / raw)
To: Tim Harvey
Cc: Baochen Qiang, ath11k, linux-wireless, Fabio Estevam,
Christoph Hellwig, Marek Szyprowski, Robin Murphy, iommu
I scrolled three pages before giving up as it was just quotes over
quotes. Can you please write an email that contains whatever you're
trying to tell instead of just quotes? Same for the person replying.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: ath11k swiotlb buffer is full (on IMX8M with 4GiB DRAM)
2024-12-09 8:17 ` Christoph Hellwig
@ 2024-12-09 10:49 ` Robin Murphy
2024-12-09 19:15 ` Tim Harvey
0 siblings, 1 reply; 17+ messages in thread
From: Robin Murphy @ 2024-12-09 10:49 UTC (permalink / raw)
To: Christoph Hellwig, Tim Harvey
Cc: Baochen Qiang, ath11k, linux-wireless, Fabio Estevam,
Marek Szyprowski, iommu
On 09/12/2024 8:17 am, Christoph Hellwig wrote:
> I scrolled three pages before giving up as it was just quotes over
> quotes. Can you please write an email that contains whatever you're
> trying to tell instead of just quotes? Same for the person replying.
TBH I'm hesitant to look too closely since everything those Atheros WiFi
drivers do with DMA tends to be sketchy, but from what I could make out
from skimming until I also gave up, I think it might be an attempt to
reinvent dma_alloc_pages(), or possibly dma_alloc_noncoherent().
Cheers,
Robin.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: ath11k swiotlb buffer is full (on IMX8M with 4GiB DRAM)
2024-12-09 10:49 ` Robin Murphy
@ 2024-12-09 19:15 ` Tim Harvey
2024-12-10 4:11 ` Christoph Hellwig
0 siblings, 1 reply; 17+ messages in thread
From: Tim Harvey @ 2024-12-09 19:15 UTC (permalink / raw)
To: Robin Murphy, Christoph Hellwig
Cc: Baochen Qiang, ath11k, linux-wireless, Fabio Estevam,
Marek Szyprowski, iommu, P Praneesh
On Mon, Dec 9, 2024 at 2:49 AM Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 09/12/2024 8:17 am, Christoph Hellwig wrote:
> > I scrolled three pages before giving up as it was just quotes over
> > quotes. Can you please write an email that contains whatever you're
> > trying to tell instead of just quotes? Same for the person replying.
>
Christoph,
Understood; there was a lot of back and forth and likely some
misinformation from my early replies. Let me recap here.
The issue I run into is that with Linux 6.9 and beyond on an IMX8M
Mini SoC (no IOMMU) with >3GiB DRAM (which requires more than 32 bits
of address due to IMX8M's DRAM base being at 0x40000000) the ath11k
driver will fail to register a netdev and errors out with 'swiotlb
buffer is full':
[ 8.057077] ath11k_pci 0000:04:00.0: BAR 0 [mem
0x18200000-0x183fffff 64bit]: assigned
[ 8.057151] ath11k_pci 0000:04:00.0: enabling device (0000 -> 0002)
[ 8.091920] ath11k_pci 0000:04:00.0: MSI vectors: 16
[ 8.091960] ath11k_pci 0000:04:00.0: qcn9074 hw1.0
[ 8.832924] ath11k_pci 0000:04:00.0: chip_id 0x0 chip_family 0x0
board_id 0xff soc_id 0xffffffff
[ 8.832951] ath11k_pci 0000:04:00.0: fw_version 0x270206d0
fw_build_timestamp 2022-08-04 12:48 fw_build_id
WLAN.HK.2.7.0.1-01744-QCAHKSWPL_SILICONZ-1
[ 10.194343] ath11k_pci 0000:04:00.0: swiotlb buffer is full (sz:
1048583 bytes), total 32768 (slots), used 2529 (slots)
[ 10.194406] ath11k_pci 0000:04:00.0: failed to set up tcl_comp ring (0) :-12
[ 10.194781] ath11k_pci 0000:04:00.0: failed to init DP: -12
After a lot of back and forth and investigation, this turns out to be
due to the IMX8M SoC not having an IOMMU, so swiotlb is used, and ath11k
requests some buffers that are too large for swiotlb to provide.
There is a specific patch which moved the HAL_WBM2SW_RELEASE buffers
to cacheable memory that could be reverted to fix this, but the concern
was that moving those buffers back to non-cacheable memory would impact
performance (there are three ~1MiB buffers being allocated):
commit d0e2523bfa9cb ("ath11k: allocate HAL_WBM2SW_RELEASE ring from
cacheable memory").
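To make the size limit concrete, here is a minimal userspace sketch (not
kernel code) of the arithmetic. It assumes the slot size and segment size
currently defined in include/linux/swiotlb.h (IO_TLB_SHIFT of 11 and
IO_TLB_SEGSIZE of 128); the helper names are hypothetical, and it ignores
the alignment offset the real allocator adds to a request.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed values, copied from include/linux/swiotlb.h; other kernel
 * versions may define them differently. */
#define IO_TLB_SHIFT   11
#define IO_TLB_SIZE    (1UL << IO_TLB_SHIFT)  /* 2048 bytes per slot */
#define IO_TLB_SEGSIZE 128UL                  /* max contiguous slots per mapping */

/* Hypothetical helper: number of slots a mapping of 'size' bytes occupies. */
static inline size_t slots_needed(size_t size)
{
	return (size + IO_TLB_SIZE - 1) / IO_TLB_SIZE;
}

/* Hypothetical helper: largest single mapping swiotlb can bounce. */
static inline size_t swiotlb_max_single_mapping(void)
{
	return IO_TLB_SIZE * IO_TLB_SEGSIZE;
}

/* Hypothetical helper: true if one bounce mapping can hold 'size' bytes. */
static inline bool fits_in_one_swiotlb_mapping(size_t size)
{
	return slots_needed(size) <= IO_TLB_SEGSIZE;
}
```

Under these assumptions a 65535-byte mapping takes 32 slots, while the
1048583-byte request would need 513, well over the 128-slot (256KiB) cap.
That is why the map fails even though the log above shows only 2529 of
32768 slots in use.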
> TBH I'm hesitant to look too closely since everything those Atheros WiFi
> drivers do with DMA tends to be sketchy, but from what I could make out
> from skimming until I also gave up, I think it might be an attempt to
> reinvent dma_alloc_pages(), or possibly dma_alloc_noncoherent().
Robin,
Agreed - I'm not sure how much attention, review, or testing these
ath11k patches originally got, given that there appears to have been
breakage here for a couple of years.
The chain of events, as best I can tell, is:
commit 6452f0a3d565 ("ath11k: allocate dst ring descriptors from
cacheable memory")
- Nov 12 2021 (made it into Linux 5.17)
- changes allocation of reo_dst rings to cacheable memory to allow
cached descriptor access to optimize CPU usage
- this is flawed because it uses virt_to_phys() to obtain the device
address of the cacheable memory, which does not work on systems with an
IOMMU enabled or using software IOMMU (swiotlb); this causes a kernel
crash on client association
commit d0e2523bfa9c ("ath11k: allocate HAL_WBM2SW_RELEASE ring from
cacheable memory")
- Nov 12 2021 (made it into Linux 5.17)
- furthers the previous patch by also including the WBM2SW buffers in
cacheable memory which are about 1MiB in size
commit aaf244141ed7 ("wifi: ath11k: fix IOMMU errors on buffer rings")
- Dec 20 2023 (made it into Linux 6.9)
- resolves the issue from commit 6452f0a3d565 (but is missing a Fixes
tag) by changing the virt_to_phys() calls to dma_map_single(), but on
systems that need software IOMMU (IMX8MM > 3GiB) this exposes the
'swiotlb buffer is full' limitation due to commit d0e2523bfa9c which
allocates buffers exceeding the 256KiB limit imposed by swiotlb
Therefore in the case of an IOMMU'less system with DMA address
limitations of 32bit and >3GiB DRAM (as many IMX8M boards have) the
ath11k driver has been broken since 5.17.
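The address arithmetic behind that can be sketched as follows. This is a
standalone illustration, not driver code: the DRAM base of 0x40000000 is
the one mentioned earlier in this mail, while the helper names are
hypothetical and dma_bit_mask() merely mirrors what the kernel's
DMA_BIT_MASK() macro computes.

```c
#include <stdbool.h>
#include <stdint.h>

#define IMX8M_DRAM_BASE 0x40000000ULL  /* DRAM base as stated in this thread */
#define GIB             (1ULL << 30)

/* Mirrors the kernel's DMA_BIT_MASK(n): highest address reachable with n bits. */
static inline uint64_t dma_bit_mask(unsigned int bits)
{
	return bits >= 64 ? ~0ULL : (1ULL << bits) - 1;
}

/* Hypothetical helper: physical address of the last byte of DRAM. */
static inline uint64_t dram_last_addr(uint64_t base, uint64_t size)
{
	return base + size - 1;
}

/* Hypothetical helper: true if some DRAM lies above what the bus can
 * address, so streaming mappings landing there must be bounced. */
static inline bool needs_bounce(uint64_t base, uint64_t size, unsigned int bits)
{
	return dram_last_addr(base, size) > dma_bit_mask(bits);
}
```

With 4GiB of DRAM the last byte sits at 0x13fffffff, beyond a 32bit mask,
so buffers that land in the top 1GiB have to go through swiotlb; with 3GiB
the last byte is 0xffffffff and nothing needs bouncing.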
Best Regards,
Tim
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: ath11k swiotlb buffer is full (on IMX8M with 4GiB DRAM)
2024-12-09 19:15 ` Tim Harvey
@ 2024-12-10 4:11 ` Christoph Hellwig
2024-12-10 23:06 ` Tim Harvey
0 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2024-12-10 4:11 UTC (permalink / raw)
To: Tim Harvey
Cc: Robin Murphy, Christoph Hellwig, Baochen Qiang, ath11k,
linux-wireless, Fabio Estevam, Marek Szyprowski, iommu,
P Praneesh
On Mon, Dec 09, 2024 at 11:15:02AM -0800, Tim Harvey wrote:
> After a lot of back and forth and investigation this is due to the
> IMX8M SoC's not having an IOMMU thus swiotlb is being used and ath11k
> is requesting some buffers that are too large for swiotlb to provide.
> There is a specific patch which added the HAL_WBM2SW_RELEASE buffers
> to cacheable memory that could be reverted to fix this but the concern
> was that it would impact performance moving those buffers to
> non-cacheable memory (there are three ~1MiB buffers being allocated):
> commit d0e2523bfa9cb ("ath11k: allocate HAL_WBM2SW_RELEASE ring from
> cacheable memory").
The combination of "buffers" and "swiotlb" sounds like Robin was right
below.
> The chain of events as best I can tell are:
>
> commit 6452f0a3d565 ("ath11k: allocate dst ring descriptors from
> cacheable memory")
> - Nov 12 2021 (made it into Linux 5.17)
> - changes allocation of reo_dst rings to cacheable memory to allow
> cached descriptor access to optimize CPU usage
> - this is flawed because it uses virt_to_phys() to allocate cacheable
> memory which does not work on systems with an IOMMU enabled or using
> software IOMMU (swiotlb); this causes a kernel crash on client
> association
And this is where it started to take a wrong turn, that everything
later basically made worse. If you have long living and potentially
large DMA allocations, you need to use dma_alloc_* interfaces.
5.17 already had dma_alloc_pages for quite a while which was and still is
the proper interface to use. For much older kernels you'd be stuck
with dma_alloc_noncoherent or dma_alloc_attrs with the right flag,
but even that would have been much better.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: ath11k swiotlb buffer is full (on IMX8M with 4GiB DRAM)
2024-12-10 4:11 ` Christoph Hellwig
@ 2024-12-10 23:06 ` Tim Harvey
2024-12-11 2:31 ` Baochen Qiang
0 siblings, 1 reply; 17+ messages in thread
From: Tim Harvey @ 2024-12-10 23:06 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Robin Murphy, Baochen Qiang, ath11k, linux-wireless,
Fabio Estevam, Marek Szyprowski, iommu
On Mon, Dec 9, 2024 at 8:11 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Mon, Dec 09, 2024 at 11:15:02AM -0800, Tim Harvey wrote:
> > After a lot of back and forth and investigation this is due to the
> > IMX8M SoC's not having an IOMMU thus swiotlb is being used and ath11k
> > is requesting some buffers that are too large for swiotlb to provide.
> > There is a specific patch which added the HAL_WBM2SW_RELEASE buffers
> > to cacheable memory that could be reverted to fix this but the concern
> > was that it would impact performance moving those buffers to
> > non-cacheable memory (there are three ~1MiB buffers being allocated):
> > commit d0e2523bfa9cb ("ath11k: allocate HAL_WBM2SW_RELEASE ring from
> > cacheable memory").
>
> The combination of "buffers" and "swiotlb" sounds like Robin was right
> below.
>
> > The chain of events as best I can tell are:
> >
> > commit 6452f0a3d565 ("ath11k: allocate dst ring descriptors from
> > cacheable memory")
> > - Nov 12 2021 (made it into Linux 5.17)
> > - changes allocation of reo_dst rings to cacheable memory to allow
> > cached descriptor access to optimize CPU usage
> > - this is flawed because it uses virt_to_phys() to allocate cacheable
> > memory which does not work on systems with an IOMMU enabled or using
> > software IOMMU (swiotlb); this causes a kernel crash on client
> > association
>
> And this is where it started to take a wrong turn, that everything
> later basically made worse. If you have long living and potentially
> large DMA allocations, you need to use dma_alloc_* interfaces.
>
> 5.17 already had dma_alloc_pages for quite a while which was and still is
> the proper interface to use. For much older kernel you'd be stuck
> with dma_alloc_noncoherent or dma_alloc_attrs with the right flag,
> but even that would have been much better.
Christoph,
I'm not clear what you are suggesting be done here. Are you suggesting
that ath11k has been using the wrong mechanism by calling
dma_map_single for cached DMA buffers? I'm not all that familiar with
ath11k so I can't tell what buffers are considered long living.
Best Regards,
Tim
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: ath11k swiotlb buffer is full (on IMX8M with 4GiB DRAM)
2024-12-10 23:06 ` Tim Harvey
@ 2024-12-11 2:31 ` Baochen Qiang
2024-12-11 13:44 ` Robin Murphy
0 siblings, 1 reply; 17+ messages in thread
From: Baochen Qiang @ 2024-12-11 2:31 UTC (permalink / raw)
To: Tim Harvey, Christoph Hellwig
Cc: Robin Murphy, ath11k, linux-wireless, Fabio Estevam,
Marek Szyprowski, iommu
On 12/11/2024 7:06 AM, Tim Harvey wrote:
> On Mon, Dec 9, 2024 at 8:11 PM Christoph Hellwig <hch@lst.de> wrote:
>>
>> On Mon, Dec 09, 2024 at 11:15:02AM -0800, Tim Harvey wrote:
>>> After a lot of back and forth and investigation this is due to the
>>> IMX8M SoC's not having an IOMMU thus swiotlb is being used and ath11k
>>> is requesting some buffers that are too large for swiotlb to provide.
>>> There is a specific patch which added the HAL_WBM2SW_RELEASE buffers
>>> to cacheable memory that could be reverted to fix this but the concern
>>> was that it would impact performance moving those buffers to
>>> non-cacheable memory (there are three ~1MiB buffers being allocated):
>>> commit d0e2523bfa9cb ("ath11k: allocate HAL_WBM2SW_RELEASE ring from
>>> cacheable memory").
>>
>> The combination of "buffers" and "swiotlb" sounds like Robin was right
>> below.
>>
>>> The chain of events as best I can tell are:
>>>
>>> commit 6452f0a3d565 ("ath11k: allocate dst ring descriptors from
>>> cacheable memory")
>>> - Nov 12 2021 (made it into Linux 5.17)
>>> - changes allocation of reo_dst rings to cacheable memory to allow
>>> cached descriptor access to optimize CPU usage
>>> - this is flawed because it uses virt_to_phys() to allocate cacheable
>>> memory which does not work on systems with an IOMMU enabled or using
>>> software IOMMU (swiotlb); this causes a kernel crash on client
>>> association
>>
>> And this is where it started to take a wrong turn, that everything
>> later basically made worse. If you have long living and potentially
>> large DMA allocations, you need to use dma_alloc_* interfaces.
>>
>> 5.17 already had dma_alloc_pages for quite a while which was and still is
>> the proper interface to use. For much older kernel you'd be stuck
>> with dma_alloc_noncoherent or dma_alloc_attrs with the right flag,
>> but even that would have been much better.
>
> Christoph,
>
> I'm not clear what you are suggesting be done here. Are you suggesting
> that ath11k has been using the wrong mechanism by calling
> dma_map_single for cached DMA buffers? I'm not all that familiar with
> ath11k so I can't tell what buffers are considered long living.
those buffers are allocated when the driver loads and freed when the driver unloads, so IMO
they are long living.
>
> Best Regards,
>
> Tim
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: ath11k swiotlb buffer is full (on IMX8M with 4GiB DRAM)
2024-12-11 2:31 ` Baochen Qiang
@ 2024-12-11 13:44 ` Robin Murphy
2024-12-12 5:24 ` Baochen Qiang
0 siblings, 1 reply; 17+ messages in thread
From: Robin Murphy @ 2024-12-11 13:44 UTC (permalink / raw)
To: Baochen Qiang, Tim Harvey, Christoph Hellwig
Cc: ath11k, linux-wireless, Fabio Estevam, Marek Szyprowski, iommu
On 2024-12-11 2:31 am, Baochen Qiang wrote:
>
>
> On 12/11/2024 7:06 AM, Tim Harvey wrote:
>> On Mon, Dec 9, 2024 at 8:11 PM Christoph Hellwig <hch@lst.de> wrote:
>>>
>>> On Mon, Dec 09, 2024 at 11:15:02AM -0800, Tim Harvey wrote:
>>>> After a lot of back and forth and investigation this is due to the
>>>> IMX8M SoC's not having an IOMMU thus swiotlb is being used and ath11k
>>>> is requesting some buffers that are too large for swiotlb to provide.
>>>> There is a specific patch which added the HAL_WBM2SW_RELEASE buffers
>>>> to cacheable memory that could be reverted to fix this but the concern
>>>> was that it would impact performance moving those buffers to
>>>> non-cacheable memory (there are three ~1MiB buffers being allocated):
>>>> commit d0e2523bfa9cb ("ath11k: allocate HAL_WBM2SW_RELEASE ring from
>>>> cacheable memory").
>>>
>>> The combination of "buffers" and "swiotlb" sounds like Robin was right
>>> below.
>>>
>>>> The chain of events as best I can tell are:
>>>>
>>>> commit 6452f0a3d565 ("ath11k: allocate dst ring descriptors from
>>>> cacheable memory")
>>>> - Nov 12 2021 (made it into Linux 5.17)
>>>> - changes allocation of reo_dst rings to cacheable memory to allow
>>>> cached descriptor access to optimize CPU usage
>>>> - this is flawed because it uses virt_to_phys() to allocate cacheable
>>>> memory which does not work on systems with an IOMMU enabled or using
>>>> software IOMMU (swiotlb); this causes a kernel crash on client
>>>> association
>>>
>>> And this is where it started to take a wrong turn, that everything
>>> later basically made worse. If you have long living and potentially
>>> large DMA allocations, you need to use dma_alloc_* interfaces.
>>>
>>> 5.17 already had dma_alloc_pages for quite a while which was and still is
>>> the proper interface to use. For much older kernel you'd be stuck
>>> with dma_alloc_noncoherent or dma_alloc_attrs with the right flag,
>>> but even that would have been much better.
>>
>> Christoph,
>>
>> I'm not clear what you are suggesting be done here. Are you suggesting
>> that ath11k has been using the wrong mechanism by calling
>> dma_map_single for cached DMA buffers? I'm not all that familiar with
>> ath11k so I can't tell what buffers are considered long living.
>
> those buffers are allocated when the driver loads and freed when the driver unloads, so IMO
> they are long living.
The point is that if this driver wants a notion of "cached DMA buffers",
then it should allocate such buffers the proper way, not try to reinvent
it badly. That means using dma_alloc_pages(), or modern
dma_alloc_noncoherent() which is essentially the same thing but with the
dma_map_page() call automatically done for you as well.
Thanks,
Robin.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: ath11k swiotlb buffer is full (on IMX8M with 4GiB DRAM)
2024-12-11 13:44 ` Robin Murphy
@ 2024-12-12 5:24 ` Baochen Qiang
2024-12-12 7:18 ` Christoph Hellwig
2024-12-12 19:57 ` Tim Harvey
0 siblings, 2 replies; 17+ messages in thread
From: Baochen Qiang @ 2024-12-12 5:24 UTC (permalink / raw)
To: Robin Murphy, Tim Harvey, Christoph Hellwig
Cc: ath11k, linux-wireless, Fabio Estevam, Marek Szyprowski, iommu
On 12/11/2024 9:44 PM, Robin Murphy wrote:
> On 2024-12-11 2:31 am, Baochen Qiang wrote:
>>
>>
>> On 12/11/2024 7:06 AM, Tim Harvey wrote:
>>> On Mon, Dec 9, 2024 at 8:11 PM Christoph Hellwig <hch@lst.de> wrote:
>>>>
>>>> On Mon, Dec 09, 2024 at 11:15:02AM -0800, Tim Harvey wrote:
>>>>> After a lot of back and forth and investigation this is due to the
>>>>> IMX8M SoC's not having an IOMMU thus swiotlb is being used and ath11k
>>>>> is requesting some buffers that are too large for swiotlb to provide.
>>>>> There is a specific patch which added the HAL_WBM2SW_RELEASE buffers
>>>>> to cacheable memory that could be reverted to fix this but the concern
>>>>> was that it would impact performance moving those buffers to
>>>>> non-cacheable memory (there are three ~1MiB buffers being allocated):
>>>>> commit d0e2523bfa9cb ("ath11k: allocate HAL_WBM2SW_RELEASE ring from
>>>>> cacheable memory").
>>>>
>>>> The combination of "buffers" and "swiotlb" sounds like Robin was right
>>>> below.
>>>>
>>>>> The chain of events as best I can tell are:
>>>>>
>>>>> commit 6452f0a3d565 ("ath11k: allocate dst ring descriptors from
>>>>> cacheable memory")
>>>>> - Nov 12 2021 (made it into Linux 5.17)
>>>>> - changes allocation of reo_dst rings to cacheable memory to allow
>>>>> cached descriptor access to optimize CPU usage
>>>>> - this is flawed because it uses virt_to_phys() to allocate cacheable
>>>>> memory which does not work on systems with an IOMMU enabled or using
>>>>> software IOMMU (swiotlb); this causes a kernel crash on client
>>>>> association
>>>>
>>>> And this is where it started to take a wrong turn, that everything
>>>> later basically made worse. If you have long living and potentially
>>>> large DMA allocations, you need to use dma_alloc_* interfaces.
>>>>
>>>> 5.17 already had dma_alloc_pages for quite a while which was and still is
>>>> the proper interface to use. For much older kernel you'd be stuck
>>>> with dma_alloc_noncoherent or dma_alloc_attrs with the right flag,
>>>> but even that would have been much better.
>>>
>>> Christoph,
>>>
>>> I'm not clear what you are suggesting be done here. Are you suggesting
>>> that ath11k has been using the wrong mechanism by calling
>>> dma_map_single for cached DMA buffers? I'm not all that familiar with
>>> ath11k so I can't tell what buffers are considered long living.
>>
>> those buffers are allocated when driver load and freed when driver unload, so IMO they are
>> long living.
>
> The point is that if this driver wants a notion of "cached DMA buffers", then it should
> allocate such buffers the proper way, not try to reinvent it badly. That means using
> dma_alloc_pages(), or modern dma_alloc_noncoherent() which is essentially the same thing
> but with the dma_map_page() call automatically done for you as well.
Yeah, you are right, Robin. I didn't know there were convenient interfaces like these already.
Tim, can you work out a patch then?
>
> Thanks,
> Robin.
* Re: ath11k swiotlb buffer is full (on IMX8M with 4GiB DRAM)
2024-12-12 5:24 ` Baochen Qiang
@ 2024-12-12 7:18 ` Christoph Hellwig
2024-12-12 18:08 ` Jeff Johnson
2024-12-12 19:57 ` Tim Harvey
1 sibling, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2024-12-12 7:18 UTC (permalink / raw)
To: Baochen Qiang
Cc: Robin Murphy, Tim Harvey, Christoph Hellwig, ath11k,
linux-wireless, Fabio Estevam, Marek Szyprowski, iommu
On Thu, Dec 12, 2024 at 01:24:45PM +0800, Baochen Qiang wrote:
> yeah, you are right, Robin. didn't know there are convenient interfaces
> like these already.
FYI, the DMA-API documentation says pretty straightforwardly not to
use the streaming API for this, and it documents these interfaces. But
maybe it's not clear enough, so I'm open to suggestions on how to make
it more obvious.
* Re: ath11k swiotlb buffer is full (on IMX8M with 4GiB DRAM)
2024-12-12 7:18 ` Christoph Hellwig
@ 2024-12-12 18:08 ` Jeff Johnson
0 siblings, 0 replies; 17+ messages in thread
From: Jeff Johnson @ 2024-12-12 18:08 UTC (permalink / raw)
To: Christoph Hellwig, Baochen Qiang
Cc: Robin Murphy, Tim Harvey, ath11k, linux-wireless, Fabio Estevam,
Marek Szyprowski, iommu
On 12/11/2024 11:18 PM, Christoph Hellwig wrote:
> On Thu, Dec 12, 2024 at 01:24:45PM +0800, Baochen Qiang wrote:
>> yeah, you are right, Robin. didn't know there are convenient interfaces
>> like these already.
>
> FYI, the DMA-API documentation says pretty straight forward to not
> use the streaming API and documents these interfaces. But maybe it's
> not clear enough, so I'm open to suggestions on how to make it more
> obvious.
This is probably less of a documentation issue and more of an issue with the
fact that each instance of the ath driver is based upon the prior instance,
and hence it inherits any usage of legacy APIs. So unless someone notices
there is a better API, or proactively adds the usage of the new API for us,
the ath drivers will continue to use legacy APIs.
So it is appreciated that light has been shed on this issue.
/jeff
* Re: ath11k swiotlb buffer is full (on IMX8M with 4GiB DRAM)
2024-12-12 5:24 ` Baochen Qiang
2024-12-12 7:18 ` Christoph Hellwig
@ 2024-12-12 19:57 ` Tim Harvey
1 sibling, 0 replies; 17+ messages in thread
From: Tim Harvey @ 2024-12-12 19:57 UTC (permalink / raw)
To: Baochen Qiang
Cc: Robin Murphy, Christoph Hellwig, ath11k, linux-wireless,
Fabio Estevam, Marek Szyprowski, iommu
On Wed, Dec 11, 2024 at 9:24 PM Baochen Qiang <quic_bqiang@quicinc.com> wrote:
>
>
>
> On 12/11/2024 9:44 PM, Robin Murphy wrote:
> > On 2024-12-11 2:31 am, Baochen Qiang wrote:
> >>
> >>
> >> On 12/11/2024 7:06 AM, Tim Harvey wrote:
> >>> On Mon, Dec 9, 2024 at 8:11 PM Christoph Hellwig <hch@lst.de> wrote:
> >>>>
> >>>> On Mon, Dec 09, 2024 at 11:15:02AM -0800, Tim Harvey wrote:
> >>>>> After a lot of back and forth and investigation this is due to the
> >>>>> IMX8M SoC's not having an IOMMU thus swiotlb is being used and ath11k
> >>>>> is requesting some buffers that are too large for swiotlb to provide.
> >>>>> There is a specific patch which added the HAL_WBM2SW_RELEASE buffers
> >>>>> to cacheable memory that could be reverted to fix this but the concern
> >>>>> was that it would impact performance moving those buffers to
> >>>>> non-cacheable memory (there are three ~1MiB buffers being allocated):
> >>>>> commit d0e2523bfa9cb ("ath11k: allocate HAL_WBM2SW_RELEASE ring from
> >>>>> cacheable memory").
> >>>>
> >>>> The combination of "buffers" and "swiotlb" sounds like Robin was right
> >>>> below.
> >>>>
> >>>>> The chain of events as best I can tell are:
> >>>>>
> >>>>> commit 6452f0a3d565 ("ath11k: allocate dst ring descriptors from
> >>>>> cacheable memory")
> >>>>> - Nov 12 2021 (made it into Linux 5.17)
> >>>>> - changes allocation of reo_dst rings to cacheable memory to allow
> >>>>> cached descriptor access to optimize CPU usage
> >>>>> - this is flawed because it uses virt_to_phys() to allocate cacheable
> >>>>> memory which does not work on systems with an IOMMU enabled or using
> >>>>> software IOMMU (swiotlb); this causes a kernel crash on client
> >>>>> association
> >>>>
> >>>> And this is where it started to take a wrong turn, that everything
> >>>> later basically made worse. If you have long living and potentially
> >>>> large DMA allocations, you need to use dma_alloc_* interfaces.
> >>>>
> >>>> 5.17 already had dma_alloc_pages for quite a while which was and still is
> >>>> the proper interface to use. For much older kernel you'd be stuck
> >>>> with dma_alloc_noncoherent or dma_alloc_attrs with the right flag,
> >>>> but even that would have been much better.
> >>>
> >>> Christoph,
> >>>
> >>> I'm not clear what you are suggesting be done here. Are you suggesting
> >>> that ath11k has been using the wrong mechanism by calling
> >>> dma_map_single for cached DMA buffers? I'm not all that familiar with
> >>> ath11k so I can't tell what buffers are considered long living.
> >>
> >> those buffers are allocated when driver load and freed when driver unload, so IMO they are
> >> long living.
> >
> > The point is that if this driver wants a notion of "cached DMA buffers", then it should
> > allocate such buffers the proper way, not try to reinvent it badly. That means using
> > dma_alloc_pages(), or modern dma_alloc_noncoherent() which is essentially the same thing
> > but with the dma_map_page() call automatically done for you as well.
>
> yeah, you are right, Robin. didn't know there are convenient interfaces like these already.
>
> Tim, can you work out a patch then?
>
How about:
diff --git a/drivers/net/wireless/ath/ath11k/dp.c b/drivers/net/wireless/ath/ath11k/dp.c
index fbf666d0ecf1..557e06187e95 100644
--- a/drivers/net/wireless/ath/ath11k/dp.c
+++ b/drivers/net/wireless/ath/ath11k/dp.c
@@ -105,9 +105,8 @@ void ath11k_dp_srng_cleanup(struct ath11k_base *ab, struct dp_srng *ring)
 		return;

 	if (ring->cached) {
-		dma_unmap_single(ab->dev, ring->paddr_unaligned, ring->size,
-				 DMA_FROM_DEVICE);
-		kfree(ring->vaddr_unaligned);
+		dma_free_noncoherent(ab->dev, ring->size, ring->vaddr_unaligned,
+				     ring->paddr_unaligned, DMA_FROM_DEVICE);
 	} else {
 		dma_free_coherent(ab->dev, ring->size, ring->vaddr_unaligned,
 				  ring->paddr_unaligned);
@@ -249,28 +248,18 @@ int ath11k_dp_srng_setup(struct ath11k_base *ab, struct dp_srng *ring,
 		default:
 			cached = false;
 		}
-
-		if (cached) {
-			ring->vaddr_unaligned = kzalloc(ring->size, GFP_KERNEL);
-			if (!ring->vaddr_unaligned)
-				return -ENOMEM;
-
-			ring->paddr_unaligned = dma_map_single(ab->dev,
-							       ring->vaddr_unaligned,
-							       ring->size,
-							       DMA_FROM_DEVICE);
-			if (dma_mapping_error(ab->dev, ring->paddr_unaligned)) {
-				kfree(ring->vaddr_unaligned);
-				ring->vaddr_unaligned = NULL;
-				return -ENOMEM;
-			}
-		}
 	}

-	if (!cached)
+	if (cached) {
+		ring->vaddr_unaligned = dma_alloc_noncoherent(ab->dev, ring->size,
+							      &ring->paddr_unaligned,
+							      DMA_FROM_DEVICE,
+							      GFP_KERNEL);
+	} else {
 		ring->vaddr_unaligned = dma_alloc_coherent(ab->dev, ring->size,
 							   &ring->paddr_unaligned,
 							   GFP_KERNEL);
+	}

 	if (!ring->vaddr_unaligned)
 		return -ENOMEM;
If this is what we are talking about here, I can submit that with a
proper commit log. Note that there are a lot of other calls to
dma_map_single in the ath drivers. My understanding is that those may be
just fine for small, short-lived buffers, but I'm not clear whether that
is what they are always used for.
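For context on why the ~1MiB rings fail while small short-lived mappings
do not, here is a rough userspace sketch of the swiotlb arithmetic. It
assumes the stock constants from include/linux/swiotlb.h (2 KiB slots, at
most 128 contiguous slots per mapping). The helper names are made up for
illustration and are not kernel functions, and the 1048583-byte size is
the one reported in the log at the top of this thread.

```c
#include <stddef.h>

/* Assumed to mirror include/linux/swiotlb.h; copied here for illustration. */
#define IO_TLB_SHIFT   11                     /* log2 of one bounce slot */
#define IO_TLB_SIZE    (1UL << IO_TLB_SHIFT)  /* 2048 bytes per slot */
#define IO_TLB_SEGSIZE 128UL                  /* max contiguous slots per mapping */

/* Hypothetical helper: largest single mapping swiotlb can bounce, in bytes. */
static size_t swiotlb_max_mapping_bytes(void)
{
	return IO_TLB_SEGSIZE * IO_TLB_SIZE;
}

/* Hypothetical helper: slots a mapping of 'bytes' would occupy (rounded up). */
static size_t swiotlb_slots_needed(size_t bytes)
{
	return (bytes + IO_TLB_SIZE - 1) / IO_TLB_SIZE;
}

/* Hypothetical helper: nonzero if 'bytes' fits in one contiguous segment. */
static int fits_in_one_swiotlb_mapping(size_t bytes)
{
	return swiotlb_slots_needed(bytes) <= IO_TLB_SEGSIZE;
}
```

Under these assumptions a single bounce mapping tops out at 256 KiB, so
the 1048583-byte ring would need 513 contiguous slots and cannot be
bounced no matter how many of the 32768 slots are free. The real limit
can be slightly lower when a device sets a minimum alignment mask.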
Best Regards,
Tim
> >
> > Thanks,
> > Robin.
>
Thread overview: 17+ messages
2024-11-23 0:43 ath11k swiotlb buffer is full (on IMX8M with 4GiB DRAM) Tim Harvey
2024-11-25 7:23 ` Baochen Qiang
2024-11-25 18:02 ` Tim Harvey
2024-11-26 2:46 ` Baochen Qiang
2024-12-06 17:07 ` Tim Harvey
2024-12-09 6:39 ` Baochen Qiang
2024-12-09 8:17 ` Christoph Hellwig
2024-12-09 10:49 ` Robin Murphy
2024-12-09 19:15 ` Tim Harvey
2024-12-10 4:11 ` Christoph Hellwig
2024-12-10 23:06 ` Tim Harvey
2024-12-11 2:31 ` Baochen Qiang
2024-12-11 13:44 ` Robin Murphy
2024-12-12 5:24 ` Baochen Qiang
2024-12-12 7:18 ` Christoph Hellwig
2024-12-12 18:08 ` Jeff Johnson
2024-12-12 19:57 ` Tim Harvey