* [PATCH] iommu/rockchip: fix page table allocation flags for v2 IOMMU
@ 2026-03-31 7:50 Midgy BALON
2026-03-31 7:57 ` Shawn Lin
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Midgy BALON @ 2026-03-31 7:50 UTC (permalink / raw)
To: iommu
Cc: joro, will, robin.murphy, heiko, jonas, linux-arm-kernel,
linux-rockchip, linux-kernel, stable, Midgy BALON
commit 2a7e6400f72b ("iommu: rockchip: Allocate tables from all
available memory for IOMMU v2") removed GFP_DMA32 from
iommu_data_ops_v2, reasoning that RK356x and RK3588 IOMMU v2 hardware
supports up to 40-bit physical addresses for page tables. However, the
RK3568 IOMMU page-table walker uses a 32-bit AXI bus: it cannot access
physical addresses above 4 GB regardless of the address encoding range.

On boards with more than 4 GB of RAM (e.g. 8 GB LPDDR4X), removing
GFP_DMA32 causes two distinct failure modes:

1. Direct allocation above 4 GB: iommu_alloc_pages_sz() may return
   memory above 0x100000000. The hardware page-table walker issues a
   bus error trying to dereference those addresses, causing an IOMMU
   fault on the first DMA transaction.

2. SWIOTLB bounce-buffer poisoning: without GFP_DMA32, page tables land
   above the SWIOTLB window. dma_map_single() with DMA_BIT_MASK(32)
   then bounces them into a buffer below 4 GB. rk_dte_get_page_table()
   returns phys_to_virt() of the bounce buffer address; PTEs are written
   there; the next dma_sync_single_for_device(DMA_TO_DEVICE) copies the
   original (zero) data back over the bounce buffer, silently erasing the
   freshly written PTEs. The IOMMU faults because every PTE reads as zero.

Restore GFP_DMA32 (and DMA_BIT_MASK(32)) for iommu_data_ops_v2, which
currently only serves "rockchip,rk3568-iommu" in mainline.

Tested on Radxa ROCK 3B (RK3568, 8 GB LPDDR4X):
- MobileNetV1 via RKNN: 5.8 ms/inference (IOMMU mode)
- YOLOv5s 640x640 via RKNN: ~57 ms/inference (IOMMU mode)
- No IOMMU faults, correct inference results
Fixes: 2a7e6400f72b ("iommu: rockchip: Allocate tables from all available memory for IOMMU v2")
Cc: stable@vger.kernel.org
Cc: Jonas Karlman <jonas@kwiboo.se>
Signed-off-by: Midgy BALON <midgy971@gmail.com>
---
drivers/iommu/rockchip-iommu.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
index 85f3667e797..8b45db29471 100644
--- a/drivers/iommu/rockchip-iommu.c
+++ b/drivers/iommu/rockchip-iommu.c
@@ -1358,8 +1358,8 @@ static struct rk_iommu_ops iommu_data_ops_v2 = {
 	.pt_address = &rk_dte_pt_address_v2,
 	.mk_dtentries = &rk_mk_dte_v2,
 	.mk_ptentries = &rk_mk_pte_v2,
-	.dma_bit_mask = DMA_BIT_MASK(40),
-	.gfp_flags = 0,
+	.dma_bit_mask = DMA_BIT_MASK(32),
+	.gfp_flags = GFP_DMA32,
 };
 
 static const struct of_device_id rk_iommu_dt_ids[] = {
--
2.30.2
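
The address-width argument in the commit message can be checked with a few lines of userspace C. This is only a sketch, not driver code: DMA_BIT_MASK is redefined locally to mirror the kernel macro, and table_reachable() is a hypothetical helper that does not exist in rockchip-iommu.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in that mirrors the kernel's DMA_BIT_MASK(n) macro. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/*
 * Hypothetical helper: true if a table of `size` bytes starting at
 * physical address `phys` lies entirely within the range `mask` covers.
 */
static bool table_reachable(uint64_t phys, uint64_t size, uint64_t mask)
{
	assert(mask != 0); /* a zero mask would be a caller bug */
	return size != 0 && phys <= mask && size - 1 <= mask - phys;
}
```

With a 32-bit mask, a 4 KiB table at 0x100000000 is rejected, while the same table passes with a 40-bit mask. That is the difference between the two `.dma_bit_mask` values in the diff above.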
* Re: [PATCH] iommu/rockchip: fix page table allocation flags for v2 IOMMU
From: Shawn Lin @ 2026-03-31 7:57 UTC
To: Midgy BALON
Cc: shawn.lin, joro, will, robin.murphy, heiko, jonas, linux-arm-kernel,
linux-rockchip, linux-kernel, stable, iommu, Simon Xue

+ Simon

On Tuesday, 2026/03/31 at 15:50, Midgy BALON wrote:
> [...]
* Re: [PATCH] iommu/rockchip: fix page table allocation flags for v2 IOMMU
From: Jonas Karlman @ 2026-03-31 18:13 UTC
To: Midgy BALON
Cc: Shawn Lin, Simon Xue, iommu, joro, will, robin.murphy, heiko,
linux-arm-kernel, linux-rockchip, linux-kernel, stable

Hi Midgy,

On 3/31/2026 9:50 AM, Midgy BALON wrote:
> [...]
> @@ -1358,8 +1358,8 @@ static struct rk_iommu_ops iommu_data_ops_v2 = {
>  	.pt_address = &rk_dte_pt_address_v2,
>  	.mk_dtentries = &rk_mk_dte_v2,
>  	.mk_ptentries = &rk_mk_pte_v2,
> -	.dma_bit_mask = DMA_BIT_MASK(40),
> -	.gfp_flags = 0,
> +	.dma_bit_mask = DMA_BIT_MASK(32),
> +	.gfp_flags = GFP_DMA32,

This change is wrong because this struct describes the RK IOMMU v2, which
is capable of 40-bit addressing and is used with e.g. the RK3568 VOP2 MMU
and MMUs in other RK35xx SoCs.

What you have discovered is most likely that some IP blocks, e.g. the NPU
on RK3568, are not capable of >32-bit addressing, and/or that such IP
blocks are still using IOMMU v1 blocks, or some variant with a 32-bit
limitation.

However, the RK IOMMU driver is currently not capable of supporting
different IOMMU revisions. If I recall correctly, there may already have
been a patch on the ML trying to address that.

Have you seen this issue with a variant of the rockit driver that adds
support for RK3568, or with a variant of the downstream rknpu driver
forward ported to mainline?

If your findings are correct, the NPU MMU likely needs to use a different
compatible, since rockchip,rk3568-iommu describes the IOMMU v2 that is
capable of 40-bit addressing and is also used by other RK35xx SoCs.

Regards,
Jonas
* Re: [PATCH] iommu/rockchip: fix page table allocation flags for v2 IOMMU
From: Simon @ 2026-04-01 7:48 UTC
To: Midgy BALON, iommu
Cc: joro, will, robin.murphy, heiko, jonas, linux-arm-kernel,
linux-rockchip, linux-kernel, stable

Hi Midgy,

On 2026/3/31 15:50, Midgy BALON wrote:
> [...]
> 1. Direct allocation above 4 GB: iommu_alloc_pages_sz() may return
>    memory above 0x100000000. The hardware page-table walker issues a
>    bus error trying to dereference those addresses, causing an IOMMU
>    fault on the first DMA transaction.

Which IP block is hitting this? We'd like to take a look on our end.

> 2. SWIOTLB bounce-buffer poisoning: without GFP_DMA32, page tables land
>    above the SWIOTLB window. dma_map_single() with DMA_BIT_MASK(32)
>    then bounces them into a buffer below 4 GB. rk_dte_get_page_table()
>    returns phys_to_virt() of the bounce buffer address; PTEs are written
>    there; the next dma_sync_single_for_device(DMA_TO_DEVICE) copies the
>    original (zero) data back over the bounce buffer, silently erasing the
>    freshly written PTEs. The IOMMU faults because every PTE reads as zero.

This probably needs a separate patch. One way to fix it would be to track
the original L2 page table base addresses in struct rk_iommu_domain, then
have rk_dte_get_page_table() return the tracked address instead of
deriving it from the DTE.

> [...]
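
Failure mode 2 and the tracked-address idea can be modeled in userspace. The sketch below is an assumption-heavy toy, not the driver: struct bounced_table, sync_for_device(), write_via_bounce() and write_via_tracked() are all hypothetical names, and the memcpy() stands in for what a DMA_TO_DEVICE sync does to a bounced mapping (copy the original buffer over the bounce slot).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_PTES 4

/* Hypothetical model of one page table that the DMA layer has bounced. */
struct bounced_table {
	uint32_t orig[NUM_PTES];   /* buffer the driver allocated (above 4 GB) */
	uint32_t bounce[NUM_PTES]; /* low-memory copy the device actually reads */
};

/* Model of a DMA_TO_DEVICE sync: the original is copied over the bounce slot. */
static void sync_for_device(struct bounced_table *t)
{
	assert(t != NULL);
	memcpy(t->bounce, t->orig, sizeof(t->bounce));
}

/* Failure mode: the CPU pointer is derived from the device-visible address. */
static uint32_t write_via_bounce(struct bounced_table *t, uint32_t pte)
{
	t->bounce[0] = pte;  /* PTE lands in the bounce slot only */
	sync_for_device(t);  /* zeros from the original overwrite it */
	return t->bounce[0]; /* value the IOMMU would fetch */
}

/* Suggested direction: write through the tracked original address. */
static uint32_t write_via_tracked(struct bounced_table *t, uint32_t pte)
{
	t->orig[0] = pte;    /* PTE lands in the driver's own buffer */
	sync_for_device(t);  /* sync now carries the PTE to the device copy */
	return t->bounce[0];
}
```

Starting from a zeroed table, write_via_bounce() leaves the device-visible entry at zero, while write_via_tracked() leaves it holding the written PTE. Only the second behaviour matches what the page-table walker needs.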
* Re: [PATCH] iommu/rockchip: fix page table allocation flags for v2 IOMMU
From: Jonas Karlman @ 2026-04-01 8:41 UTC
To: Simon, Midgy BALON
Cc: iommu, joro, will, robin.murphy, heiko, linux-arm-kernel,
linux-rockchip, linux-kernel, stable

Hi Simon,

On 4/1/2026 9:48 AM, Simon wrote:
> Hi Midgy,
>
> On 2026/3/31 15:50, Midgy BALON wrote:
>> [...]
>> 1. Direct allocation above 4 GB: iommu_alloc_pages_sz() may return
>>    memory above 0x100000000. The hardware page-table walker issues a
>>    bus error trying to dereference those addresses, causing an IOMMU
>>    fault on the first DMA transaction.
>
> Which IP block is hitting this? We'd like to take a look on our end.

I have seen reports that the NPU MMU on RK3568/RK3566 has some issue using
DTE/PTE with >32-bit addresses. Maybe it uses a different MMU hw revision
or has some hw errata?

From my own testing, at least the VOP2 MMU on RK3568 (and RK3588) was able
to handle 40-bit addressable DTE/PTE, hence the original commit
2a7e6400f72b ("iommu: rockchip: Allocate tables from all available memory
for IOMMU v2").

As also mentioned in my reply at [1], maybe the NPU MMU has some hw
limitation or errata and may need to use a different compatible.

[1] https://lore.kernel.org/r/3cd63b3d-1c5e-4a11-856e-c4aeb5d97d55@kwiboo.se/

Regards,
Jonas

> [...]
* Re: [PATCH] iommu/rockchip: fix page table allocation flags for v2 IOMMU
From: Simon Xue @ 2026-04-01 10:22 UTC
To: Jonas Karlman, Midgy BALON
Cc: iommu, joro, will, robin.murphy, heiko, linux-arm-kernel,
linux-rockchip, linux-kernel, stable

Hi Jonas,

On 2026/4/1 16:41, Jonas Karlman wrote:
> [...]
> As also mentioned in my reply at [1], maybe the NPU MMU has some hw
> limitation or errata and may need to use a different compatible.

Yes, we are checking internally whether different IOMMU versions are
integrated.

I will share what we find once we have results.

> [1] https://lore.kernel.org/r/3cd63b3d-1c5e-4a11-856e-c4aeb5d97d55@kwiboo.se/
>
> [...]
* Re: [PATCH] iommu/rockchip: fix page table allocation flags for v2 IOMMU
From: Simon Xue @ 2026-04-03 4:40 UTC
To: Jonas Karlman, Midgy BALON
Cc: iommu, joro, will, robin.murphy, heiko, linux-arm-kernel,
linux-rockchip, linux-kernel, stable

On 2026/4/1 18:22, Simon Xue wrote:
> [...]
> Yes, We are checking internally whether different IOMMU versions
> integrated.
>
> I will share what we find once we have results.

We checked internally and found that the RK356x SoCs integrate two
different IOMMU versions (v1.0 and v2.0). For example, the NPU and ISP use
the v1.0 IOMMU.

Both versions can map 40-bit physical pages, but v1.0 does not support
placing the first-level page table above 4 GB.

To fix this, I think we need to land this patch first:
https://lore.kernel.org/all/20260310105303.128859-1-xxm@rock-chips.com/

Then, on top of that, we can add a new compatible string to distinguish
the IOMMU versions.

> [...]
* Re: [PATCH] iommu/rockchip: fix page table allocation flags for v2 IOMMU 2026-04-03 4:40 ` Simon Xue @ 2026-04-03 14:02 ` Midgy Balon 0 siblings, 0 replies; 8+ messages in thread From: Midgy Balon @ 2026-04-03 14:02 UTC (permalink / raw) To: Simon Xue Cc: Jonas Karlman, iommu, joro, will, robin.murphy, Heiko Stuebner, linux-arm-kernel, linux-rockchip, linux-kernel, stable From: Midgy BALON <midgy971@gmail.com> To: Simon Xue <xxm@rock-chips.com> Cc: Jonas Karlman <jonas@kwiboo.se>, iommu@lists.linux.dev, joro@8bytes.org, will@kernel.org, robin.murphy@arm.com, heiko@sntech.de, linux-arm-kernel@lists.infradead.org, linux-rockchip@lists.infradead.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org In-Reply-To: <5663593b-2c53-4632-ad2c-db9efa8e9ab2@rock-chips.com> References: <20260331075010.1463-1-midgy971@gmail.com> <0f285782-b12a-4abd-bca7-b6c549bed59f@rock-chips.com> <e622cc9e-8fb0-454a-b88e-dc13cf2ff507@kwiboo.se> <89ed223d-1a2c-447d-9f21-76969e668855@rock-chips.com> <5663593b-2c53-4632-ad2c-db9efa8e9ab2@rock-chips.com> Subject: Re: [PATCH] iommu/rockchip: fix page table allocation flags for v2 IOMMU On 4/3/2026, Simon Xue wrote: > We internally checked that the RK356x SoCs integrate two different > IOMMU versions (v1.0 and v2.0), like NPU and ISP use the v1.0 IOMMU. > > Both versions can map 40-bit physical pages, but v1.0 does not support > placing the first-level page table above 4 GB. > > To fix this, I think we need to land this patch first: > https://lore.kernel.org/all/20260310105303.128859-1-xxm@rock-chips.com/ > > Then on top of that, we can add a new compatible string to distinguish > the IOMMU versions. Thank you Simon and Jonas for the internal investigation. This explains exactly what I observed. To answer Simon's earlier question: the IP block hitting both failure modes is the NPU IOMMU (rknpu_mmu, at 0xfde4b000), currently bound to "rockchip,rk3568-iommu" in rk356x-base.dtsi. 
Both the downstream rknpu driver and the upstream Rocket accel driver (drivers/accel/rocket/) use this IOMMU. The v1.0 first-level page table constraint explains both failure modes. On boards with more than 4 GB of RAM the DTE table can be allocated above 0x100000000, and the v1.0 hardware silently truncates or errors on that address. The SWIOTLB bounce-buffer path is a consequence of the same root cause: with DMA_BIT_MASK(32) on the NPU device, bounce buffers land below 4 GB, phys_to_virt() of the bounce address is then used as the PTE write target, and the subsequent dma_sync_single_for_device(DMA_TO_DEVICE) overwrites those PTEs with zeros from the original buffer. Please consider my original patch withdrawn. Modifying iommu_data_ops_v2 was too broad and would have incorrectly constrained VOP2 MMU and all other v2 IOMMU users. I agree fully with the two-step approach. On top of your per-device-ops patch [1], I plan to send: [1/2] iommu/rockchip: Add "rockchip,rk3568-iommu-v1" compatible for IOMMU v1.0 blocks (NPU, ISP/VICAP) on RK3568 — ops with .gfp_flags = GFP_DMA32, .dma_bit_mask = DMA_BIT_MASK(40) (v1.0 can still map 40-bit physical pages; only the DTE table base must be below 4 GB) [2/2] arm64: dts: rockchip: rk356x: Use "rockchip,rk3568-iommu-v1" for rknpu_mmu (0xfde4b000) and vicap_mmu (0xfdfe0800) One note on the SWIOTLB issue: with GFP_DMA32 in the new ops, page table allocations never reach SWIOTLB, so the "track L2 base addresses" approach you suggested should not be necessary — GFP_DMA32 prevents the bounce-buffer poisoning at the source. Happy to be corrected if there is another path where it is still needed. I am happy to add Tested-by to your per-device-ops patch [1]. [1] https://lore.kernel.org/all/20260310105303.128859-1-xxm@rock-chips.com/ Regards, Midgy BALON Le ven. 3 avr. 
2026 à 06:40, Simon Xue <xxm@rock-chips.com> a écrit : > > > 在 2026/4/1 18:22, Simon Xue 写道: > > Hi Jonas, > > > > 在 2026/4/1 16:41, Jonas Karlman 写道: > >> Hi Simon, > >> > >> On 4/1/2026 9:48 AM, Simon wrote: > >>> Hi Midgy, > >>> > >>> 在 2026/3/31 15:50, Midgy BALON 写道: > >>>> commit 2a7e6400f72b ("iommu: rockchip: Allocate tables from all > >>>> available memory for IOMMU v2") removed GFP_DMA32 from > >>>> iommu_data_ops_v2, reasoning that RK356x and RK3588 IOMMU v2 hardware > >>>> supports up to 40-bit physical addresses for page tables. However, the > >>>> RK3568 IOMMU page-table walker uses a 32-bit AXI bus: it cannot access > >>>> physical addresses above 4 GB regardless of the address encoding > >>>> range. > >>>> > >>>> On boards with more than 4 GB of RAM (e.g. 8 GB LPDDR4X), removing > >>>> GFP_DMA32 causes two distinct failure modes: > >>>> > >>>> 1. Direct allocation above 4 GB: iommu_alloc_pages_sz() may return > >>>> memory above 0x100000000. The hardware page-table walker > >>>> issues a > >>>> bus error trying to dereference those addresses, causing an IOMMU > >>>> fault on the first DMA transaction. > >>> Which IP block is hitting this? We'd like to take a look on our end. > >> I have seen reports that the NPU MMU on RK3568/RK3566 is having some > >> issue using DTE/PTE with >32-bit addresses, maybe it uses a different > >> MMU hw revision or has some hw errata? > >> > >> From my own testing at least the VOP2 MMU on RK3568 (and RK3588) was > >> able to handle 40-bit addressable DTE/PTE, hence the original commit > >> 2a7e6400f72b ("iommu: rockchip: Allocate tables from all available > >> memory for IOMMU v2"). > >> > >> As also mentioned in my reply at [1], maybe the NPU MMU has some hw > >> limitation or errata and may need to use a different compatible. > > > > Yes, We are checking internally whether different IOMMU versions > > integrated. > > > > I will share what we find once we have results. 
> >
>
> We internally checked that the RK356x SoCs integrate two different IOMMU
> versions (v1.0 and v2.0), like NPU and ISP use the v1.0 IOMMU.
>
> Both versions can map 40-bit physical pages, but v1.0 does not support
> placing the first-level page table above 4 GB.
>
> To fix this, I think we need to land this patch first:
> https://lore.kernel.org/all/20260310105303.128859-1-xxm@rock-chips.com/
>
> Then on top of that, we can add a new compatible string to distinguish
> the IOMMU versions.
>
> >> [1]
> >> https://lore.kernel.org/r/3cd63b3d-1c5e-4a11-856e-c4aeb5d97d55@kwiboo.se/
> >>
> >> Regards,
> >> Jonas
> >>
> >>>> 2. SWIOTLB bounce-buffer poisoning: without GFP_DMA32, page tables land
> >>>>    above the SWIOTLB window. dma_map_single() with DMA_BIT_MASK(32)
> >>>>    then bounces them into a buffer below 4 GB. rk_dte_get_page_table()
> >>>>    returns phys_to_virt() of the bounce buffer address; PTEs are written
> >>>>    there; the next dma_sync_single_for_device(DMA_TO_DEVICE) copies the
> >>>>    original (zero) data back over the bounce buffer, silently erasing the
> >>>>    freshly written PTEs. The IOMMU faults because every PTE reads as zero.
> >>> This probably need a separate patch. One way to fix it would be to
> >>> track the
> >>> original L2 page table base addresses in struct rk_iommu_domain,
> >>> then have rk_dte_get_page_table() return the tracked address instead of
> >>> deriving it from the DTE.
> >>>
> >>>> Restore GFP_DMA32 (and DMA_BIT_MASK(32)) for iommu_data_ops_v2, which
> >>>> currently only serves "rockchip,rk3568-iommu" in mainline.
> >>>>
> >>>> Tested on Radxa ROCK 3B (RK3568, 8 GB LPDDR4X):
> >>>> - MobileNetV1 via RKNN: 5.8 ms/inference (IOMMU mode)
> >>>> - YOLOv5s 640x640 via RKNN: ~57 ms/inference (IOMMU mode)
> >>>> - No IOMMU faults, correct inference results
> >>>>
> >>>> Fixes: 2a7e6400f72b ("iommu: rockchip: Allocate tables from all
> >>>> available memory for IOMMU v2")
> >>>> Cc: stable@vger.kernel.org
> >>>> Cc: Jonas Karlman <jonas@kwiboo.se>
> >>>> Signed-off-by: Midgy BALON <midgy971@gmail.com>
> >>>> ---
> >>>>  drivers/iommu/rockchip-iommu.c | 4 ++--
> >>>>  1 file changed, 2 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
> >>>> index 85f3667e797..8b45db29471 100644
> >>>> --- a/drivers/iommu/rockchip-iommu.c
> >>>> +++ b/drivers/iommu/rockchip-iommu.c
> >>>> @@ -1358,8 +1358,8 @@ static struct rk_iommu_ops iommu_data_ops_v2 = {
> >>>>      .pt_address = &rk_dte_pt_address_v2,
> >>>>      .mk_dtentries = &rk_mk_dte_v2,
> >>>>      .mk_ptentries = &rk_mk_pte_v2,
> >>>> -    .dma_bit_mask = DMA_BIT_MASK(40),
> >>>> -    .gfp_flags = 0,
> >>>> +    .dma_bit_mask = DMA_BIT_MASK(32),
> >>>> +    .gfp_flags = GFP_DMA32,
> >>>>  };
> >>>>
> >>>>  static const struct of_device_id rk_iommu_dt_ids[] = {
> >>
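
For readers following the thread, the SWIOTLB bounce-buffer failure mode discussed here can be modelled in user space. The sketch below is not kernel code and does not call any real DMA API: struct model_mapping and every model_* function are hypothetical names introduced only for illustration. The one assumption it relies on is the copy direction described in the patch text, namely that a DMA_TO_DEVICE sync copies the original buffer over the bounce buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_NUM_PTES 4

/*
 * Hypothetical toy model of one bounced mapping: "orig" stands for the
 * page table page allocated above 4 GB, "bounce" for the copy below
 * 4 GB that the device actually reads.
 */
struct model_mapping {
	uint32_t orig[MODEL_NUM_PTES];
	uint32_t bounce[MODEL_NUM_PTES];
};

/* Models the initial map with DMA_TO_DEVICE: original -> bounce. */
void model_map_to_device(struct model_mapping *m)
{
	memcpy(m->bounce, m->orig, sizeof(m->bounce));
}

/* Models a DMA_TO_DEVICE sync: original -> bounce again. */
void model_sync_for_device(struct model_mapping *m)
{
	memcpy(m->bounce, m->orig, sizeof(m->bounce));
}

/*
 * Failure case: the PTE is written through the bounce address (what the
 * thread says happens when the table pointer is derived from the DTE),
 * then the sync copies the still-zero original over it.
 * Returns the PTE value the device would see afterwards.
 */
uint32_t model_pte_after_bounce_write(size_t idx, uint32_t pte)
{
	struct model_mapping m;

	assert(idx < MODEL_NUM_PTES);
	memset(&m, 0, sizeof(m));
	model_map_to_device(&m);
	m.bounce[idx] = pte;		/* write lands in the wrong copy */
	model_sync_for_device(&m);	/* zeros from orig erase it */
	return m.bounce[idx];
}

/*
 * Working case: the PTE is written to the original buffer, so the sync
 * carries it into the copy the device reads.
 */
uint32_t model_pte_after_original_write(size_t idx, uint32_t pte)
{
	struct model_mapping m;

	assert(idx < MODEL_NUM_PTES);
	memset(&m, 0, sizeof(m));
	model_map_to_device(&m);
	m.orig[idx] = pte;		/* write lands in the tracked original */
	model_sync_for_device(&m);
	return m.bounce[idx];
}
```

In this model, model_pte_after_bounce_write() always returns 0 regardless of the PTE value passed in, which matches the "every PTE reads as zero" symptom, while model_pte_after_original_write() returns the value written. Allocating the tables below 4 GB in the first place avoids the bounce copy entirely, so neither path is exercised.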
Thread overview: 8+ messages
2026-03-31  7:50 [PATCH] iommu/rockchip: fix page table allocation flags for v2 IOMMU Midgy BALON
2026-03-31  7:57 ` Shawn Lin
2026-03-31 18:13 ` Jonas Karlman
2026-04-01  7:48 ` Simon
2026-04-01  8:41 ` Jonas Karlman
2026-04-01 10:22 ` Simon Xue
2026-04-03  4:40 ` Simon Xue
2026-04-03 14:02 ` Midgy Balon